<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//TaxonX//DTD Taxonomic Treatment Publishing DTD v0 20100105//EN" "../../nlm/tax-treatment-NS0.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:tp="http://www.plazi.org/taxpub" article-type="research-article" dtd-version="3.0" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">109</journal-id>
      <journal-id journal-id-type="index">urn:lsid:arphahub.com:pub:3dc5f44e-8666-58db-bc76-a455210e8891</journal-id>
      <journal-title-group>
        <journal-title xml:lang="en">JUCS - Journal of Universal Computer Science</journal-title>
        <abbrev-journal-title xml:lang="en">jucs</abbrev-journal-title>
      </journal-title-group>
      <issn pub-type="ppub">0948-695X</issn>
      <issn pub-type="epub">0948-6968</issn>
      <publisher>
        <publisher-name>Journal of Universal Computer Science</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.3217/jucs-015-04-0705</article-id>
      <article-id pub-id-type="publisher-id">29334</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Research Article</subject>
        </subj-group>
        <subj-group subj-group-type="scientific_subject">
          <subject>I.2.6 - Learning</subject>
          <subject>I.5.0 - General</subject>
          <subject>I.5.1 - Models</subject>
          <subject>I.5.3 - Clustering</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>An Efficient Data Preprocessing Procedure for Support Vector Clustering</article-title>
      </title-group>
      <contrib-group content-type="authors">
        <contrib contrib-type="author" corresp="yes">
          <name name-style="western">
            <surname>Wang</surname>
            <given-names>Jeen-Shing</given-names>
          </name>
          <email xlink:type="simple">jeenshin@mail.ncku.edu.tw</email>
          <xref ref-type="aff" rid="A1">1</xref>
        </contrib>
        <contrib contrib-type="author" corresp="no">
          <name name-style="western">
            <surname>Chiang</surname>
            <given-names>Jen-Chieh</given-names>
          </name>
          <xref ref-type="aff" rid="A1">1</xref>
        </contrib>
      </contrib-group>
      <aff id="A1">
        <label>1</label>
        <addr-line content-type="verbatim">National Cheng Kung University, Tainan City, Taiwan</addr-line>
        <institution>National Cheng Kung University</institution>
        <addr-line content-type="city">Tainan City</addr-line>
        <country>Taiwan</country>
      </aff>
      <author-notes>
        <fn fn-type="corresp">
          <p>Corresponding author: Jeen-Shing Wang (<email xlink:type="simple">jeenshin@mail.ncku.edu.tw</email>).</p>
        </fn>
      </author-notes>
      <pub-date pub-type="collection">
        <year>2009</year>
      </pub-date>
      <pub-date pub-type="epub">
        <day>28</day>
        <month>02</month>
        <year>2009</year>
      </pub-date>
      <volume>15</volume>
      <issue>4</issue>
      <fpage>705</fpage>
      <lpage>721</lpage>
      <uri content-type="arpha" xlink:href="http://openbiodiv.net/9697A69E-6006-5BA0-9725-AAFFAF955082">9697A69E-6006-5BA0-9725-AAFFAF955082</uri>
      <uri content-type="zenodo_dep_id" xlink:href="https://zenodo.org/record/7000683">7000683</uri>
      <permissions>
        <copyright-statement>Jeen-Shing Wang, Jen-Chieh Chiang</copyright-statement>
        <license license-type="creative-commons-attribution" xlink:href="" xlink:type="simple">
          <license-p>This article is freely available under the J.UCS Open Content License.</license-p>
        </license>
      </permissions>
      <abstract>
        <label>Abstract</label>
        <p>This paper presents an efficient data preprocessing procedure for support vector clustering (SVC) that reduces the size of the training dataset. In the SVC training procedure, solving the optimization problem and assigning cluster labels to the data points are both time-consuming, which makes SVC inefficient on large datasets. To address this problem, we propose a data preprocessing procedure that combines a shared nearest neighbor (SNN) algorithm with the concept of unit vectors to eliminate insignificant data points from the dataset. Computer simulations on artificial and benchmark datasets demonstrate the effectiveness of the proposed method.</p>
      </abstract>
    </article-meta>
  </front>
</article>
