Latest Articles from JUCS - Journal of Universal Computer Science
Latest 100 Articles from JUCS - Journal of Universal Computer Science https://lib.jucs.org/ Fri, 29 Mar 2024 17:24:48 +0200

What is the Consumer Attitude toward Healthcare Services? A Transfer Learning Approach for Detecting Emotions from Consumer Feedback https://lib.jucs.org/article/104093/ JUCS - Journal of Universal Computer Science 30(1): 3-24

DOI: 10.3897/jucs.104093

Authors: Bashar Alshouha, Jesus Serrano-Guerrero, David Elizondo, Francisco P. Romero, Jose A. Olivas

Abstract: The capability of offering patient-centered healthcare services involves knowing the consumer needs. Many of these needs can be conveyed through opinions about services that can be found on social networks. The consumers/patients can express their complaints, satisfaction, frustration, etc. in terms of feelings and emotions toward those services; for that reason, it is pivotal to accurately detect them. There are many recent techniques to detect sentiments or emotions, but one of the most promising is transfer learning. This allows adapting a model originally trained for a task to a different one by fine-tuning. Following this idea, the primary objective of this research is to study whether several pre-trained language models can be adapted to a task such as patient emotion detection in an efficient manner. For this purpose, seven clinical and biomedical pre-trained models and four domain-general models have been adapted to detect multiple emotions. These models have been tuned using a dataset consisting of real patient opinions which convey several emotions per opinion. The experiments carried out show that the domain-specific pre-trained models outperform the domain-general ones. Particularly, Clinical-Longformer obtained the best scores, 98.18% and 95.82% in terms of accuracy and F1-score, respectively. Analyzing the patient feedback available on social networks may provide valuable knowledge about consumer sentiments and emotions, especially for healthcare managers. This information can be very interesting for purposes such as assessing the quality of healthcare services or designing patient-centered services.

Research Article Sun, 28 Jan 2024 16:00:02 +0200
Retail Indicators Forecasting and Planning https://lib.jucs.org/article/112556/ JUCS - Journal of Universal Computer Science 29(11): 1385-1403

DOI: 10.3897/jucs.112556

Authors: Nelson Baloian, Jonathan Frez, José A. Pino, Cristóbal Fuenzalida, Sergio Peñafiel, Belisario Panay, Gustavo Zurita, Horacio Sanson

Abstract: We present a methodology to handle the problem of planning sales goals. The methodology helps the retail manager carry out simulations to find the most plausible goals for the future. One of the novel aspects of this methodology is that the analysis is based not on current sales levels, as in most previous works, but on those in the future, making a more precise and accurate analysis of the situation. The work presents the solution for a scenario using three sales performance indicators: foot traffic, conversion rate and ticket mean value for sales, but it explains how it can be generalized to more indicators. The contribution of this work is in the first place a framework, which consists of a methodology for performing sales planning, then, an algorithm, which finds the best prediction model for a particular store, and finally, a tool, which helps sales planners to set realistic sales goals based on the predicted sales. First we present the method to choose the best indicator prediction model for each retail store and then we present a tool which allows the retail manager to estimate the improvements on the indicators in order to attain a desired sales goal level; the managers may then perform several simulations for various scenarios in a fast and efficient way. The developed tool implementing this methodology was validated by experts in the subject of administration of retail stores, yielding good results.

Research Article Tue, 28 Nov 2023 18:00:08 +0200
Efficiently Finding Cyclical Patterns on Twitter Considering the Inherent Spatio-temporal Attributes of Data https://lib.jucs.org/article/112523/ JUCS - Journal of Universal Computer Science 29(11): 1404-1421

DOI: 10.3897/jucs.112523

Authors: Claudio Gutiérrez-Soto, Patricio Galdames, Daniel Navea

Abstract: Social networks such as Twitter provide thousands of terabytes per day, which can be exploited to find relevant information. This relevant information is used to promote marketing strategies, analyze current political issues, and track market trends, to name a few examples. One instance of relevant information is finding cyclic behavior patterns (i.e., patterns that frequently repeat themselves over time) in the population. Because trending topics on Twitter change rapidly, efficient algorithms are required, especially when considering location and time (i.e., the specific location and time) during broadcasts. This article presents an efficient algorithm based on association rules to find cyclical patterns on Twitter, considering the inherent spatio-temporal attributes of data. Using a Hash Table enhances the efficiency of this algorithm, called HashCycle. Notably, HashCycle does not use minimum support and can detect patterns in a single run over a sequence. The processing times of HashCycle were compared to those of the Apriori algorithm (which is well known and widely used on diverse platforms) and the Projection-based Partial Periodic Patterns (PPA) algorithm (which is one of the most efficient algorithms in terms of processing times). Empirical results from two spatio-temporal databases (a synthetic data set and one based on Twitter) show that HashCycle has more efficient processing times than two state-of-the-art algorithms: Apriori and PPA.

Research Article Tue, 28 Nov 2023 16:00:09 +0200
Restaurant Recommendations Based on Multi-Criteria Recommendation Algorithm https://lib.jucs.org/article/78240/ JUCS - Journal of Universal Computer Science 29(2): 179-200

DOI: 10.3897/jucs.78240

Authors: Qusai Y. Shambour, Mosleh M. Abualhaj, Ahmad Adel Abu-Shareha

Abstract: Recent years have witnessed a rapid explosion of online information sources about restaurants, and the selection of an appropriate restaurant has become a tedious and time-consuming task. A number of online platforms allow users to share their experiences by rating restaurants based on more than one criterion, such as food, service, and value. For online users who do not have enough information about suitable restaurants, ratings can be decisive factors when choosing a restaurant. Thus, personalized systems such as recommender systems are needed to infer the preferences of each user and then satisfy those preferences. Specifically, multi-criteria recommender systems can utilize the multi-criteria ratings of users to learn their preferences and suggest the most suitable restaurants for them to explore. Accordingly, this paper proposes an effective multi-criteria recommender algorithm for personalized restaurant recommendations. The proposed Hybrid User-Item based Multi-Criteria Collaborative Filtering algorithm exploits users’ and items’ implicit similarities to eliminate the sparseness of rating information. The experimental results based on three real-world datasets demonstrated the validity of the proposed algorithm concerning prediction accuracy, ranking performance, and prediction coverage, specifically when dealing with extremely sparse datasets, in relation to other baseline CF-based recommendation algorithms.

Research Article Tue, 28 Feb 2023 10:00:05 +0200
Customized Curriculum and Learning Approach Recommendation Techniques in Application of Virtual Reality in Medical Education https://lib.jucs.org/article/94161/ JUCS - Journal of Universal Computer Science 28(9): 949-966

DOI: 10.3897/jucs.94161

Authors: Abhishek Kumar, Abdul Khader Jilani Saudagar, Mohammed AlKhathami, Badr Alsamani, Muhammad Badruddin Khan, Mozaherul Hoque Abul Hasanat, Ankit Kumar

Abstract: Virtual Reality (VR) has made considerable gains in the consumer and professional markets. As VR has progressed as a technology, its overall usefulness for educational purposes has grown. On the other hand, the educational field struggles to keep up with the latest innovations, changing affordances, and pedagogical applications due to the rapid evolution of technology. Therefore, many have elaborated on the potential of VR in learning. This research proposes a novel technique for creating a customized curriculum for medical students and recommending a learning approach, based on deep learning techniques. Here the data has been collected based on the past performance of the students and their current requirements, and these data have been assembled into a dataset. Then this has been processed for analysis based on a CAD system integrated with deep learning techniques for creating a customized curriculum. Initially this data has been processed and analysed to remove missing and invalid data. Then these data were classified for creation of the curriculum using a gradient decision tree integrated with naïve Bayes. From this, the customized curriculum has been generated. Based on this customized curriculum, the learning approach recommendation has been carried out using a fuzzy-rule-integrated knowledge-based recommendation system. The experiments with the proposed technique yielded an accuracy of 98%, specificity of 82%, F-1 score of 79%, information overload of 75%, and precision of 81%.

Research Article Wed, 28 Sep 2022 10:00:00 +0300
Automatic Detection and Recognition of Citrus Fruit & Leaves Diseases for Precision Agriculture https://lib.jucs.org/article/94133/ JUCS - Journal of Universal Computer Science 28(9): 930-948

DOI: 10.3897/jucs.94133

Authors: Ashok Kumar Saini, Roheet Bhatnagar, Devesh Kumar Srivastava

Abstract: Machine learning is a branch of computer science concerned with developing algorithms & models capable of ‘learning through data and iterations’. Deep learning simulates the structure and function of human organs and diseases using artificial neural networks with more than one hidden layer. The primary purpose of this work is to develop and test computer vision and machine learning algorithms for classifying Huanglongbing (HLB)-infected, healthy, and unhealthy leaves and fruits of the citrus plant. The images were segmented using a normalized graph cut, and texture information was extracted using a co-occurrence matrix. The extracted attributes were used for classification, for which support vector machine (SVM) and deep learning methods were employed. When rating the classification outcomes, the accuracy of the classification and the number of false positives and false negatives were considered. The results show that deep learning could correctly classify up to 96.8% of HLB-infected leaves and fruits. Despite a broad variance in intensity among leaves collected in North India, the results suggest that this method could be beneficial in diagnosing HLB.

Research Article Wed, 28 Sep 2022 10:00:00 +0300
Affective Knowledge-enhanced Emotion Detection in Arabic Language: A Comparative Study https://lib.jucs.org/article/72590/ JUCS - Journal of Universal Computer Science 28(7): 733-757

DOI: 10.3897/jucs.72590

Authors: Jesus Serrano-Guerrero, Bashar Alshouha, Francisco P. Romero, Jose A. Olivas

Abstract: Online opinions/reviews contain a lot of sentiments and emotions that can be very useful, especially for Internet suppliers, who can learn whether their services/products are meeting their customers’ expectations or not. To detect these sentiments and emotions, most applications resort to lexicon-based approaches. The major issue here is that most well-known emotion lexicons have been developed for the English language; in other languages such as Arabic, there are fewer available tools, and often their quality is poor. The goal of this study is to compare the performance of two different types of algorithms, shallow machine learning-based and deep learning-based, when dealing with emotion detection in the Arabic language. To improve the performance of the algorithms, two lexicons, which were originally developed in other languages and translated into Arabic, have been used to add emotional features to different information models used to represent opinions. All approaches have been tested using the dataset SemEval 2018 Task 1: Affect in Tweets and the dataset LAMA+DINA. The semantic approaches outperform the classical algorithms, that is, the information provided by the lexicons clearly improves the results of the algorithms. Particularly, the BiLSTM algorithm outperforms the rest of the algorithms using word2vec. In contrast to other languages, the best results were obtained using the NRC lexicon.

Research Article Thu, 28 Jul 2022 10:00:00 +0300
Extracting concepts from triadic contexts using Binary Decision Diagram https://lib.jucs.org/article/67953/ JUCS - Journal of Universal Computer Science 28(6): 591-619

DOI: 10.3897/jucs.67953

Authors: Julio Cesar Vale Neves, Luiz Enrique Zarate, Mark Alan Junho Song

Abstract: Due to the high complexity of real problems, a considerable amount of research that deals with high volumes of information has emerged. The literature has considered new applications of data analysis for high dimensional environments in order to manage the difficulty in extracting knowledge from a database, especially with the increase in social and professional networks. Triadic Concept Analysis (TCA) is a technique used in the applied mathematical area of data analysis. Its main purpose is to enable knowledge extraction from a context that contains objects, attributes, and conditions in a hierarchical and systematized representation. There are several algorithms that can extract concepts, but they are inefficient when applied to large datasets because the computational costs are exponential. The objective of this paper is to add a new data structure, binary decision diagrams (BDD), in the TRIAS algorithm and retrieve triadic concepts for high dimensional contexts. BDD was used to characterize formal contexts, objects, attributes, and conditions. Moreover, to reduce the computational resources needed to manipulate a high-volume of data, the usage of BDD was implemented to simplify and represent data. The results show that this method has a considerably better speedup when compared to the original algorithm. Also, our approach discovered concepts that were previously unachievable when addressing high dimensional contexts.

Research Article Tue, 28 Jun 2022 10:00:00 +0300
Question to Question Similarity Analysis Using Morphological, Syntactic, Semantic, and Lexical Features https://lib.jucs.org/article/24080/ JUCS - Journal of Universal Computer Science 26(6): 671-697

DOI: 10.3897/jucs.2020.036

Authors: Mahmoud Hammad, Mohammad Al-Smadi, Qanita Baker, Muntaha D, Nour Al-Khdour, Mutaz Younes, Enas Khwaileh

Abstract: In the digitally connected world that we are living in, people expect to get answers to their questions spontaneously. This expectation increased the burden on Question/Answer platforms such as Stack Overflow and many others. A promising solution to this problem is to detect if a question being asked is similar to a question in the database, then present the answer of the detected question to the user. To address this challenge, we propose a novel Natural Language Processing (NLP) approach that detects if two Arabic questions are similar or not using their extracted morphological, syntactic, semantic, lexical, overlapping, and semantic lexical features. Our approach involves several phases including Arabic text processing, novel feature extraction, and text classification. Moreover, we conducted a comparison between seven different machine learning classifiers. The included classifiers are: Support Vector Machine (SVM), Decision Tree (DT), Logistic Regression (LR), Extreme Gradient Boosting (XGB), Random Forests (RF), Adaptive Boosting (AdaBoost), and Multilayer Perceptron (MLP). To conduct our experiments, we used a real-world questions dataset consisting of around 19,136 questions (9,568 pairs of questions) in which our approach achieved 82.93% accuracy using our XGB model on the best features selected by the Random Forest feature selection technique. This high accuracy of our model shows the ability of our approach to correctly detect similar Arabic questions and hence increase user satisfaction.

Research Article Sun, 28 Jun 2020 00:00:00 +0300
Real-Time Bot Detection from Twitter Using the Twitterbot+ Framework https://lib.jucs.org/article/24011/ JUCS - Journal of Universal Computer Science 26(4): 496-507

DOI: 10.3897/jucs.2020.026

Authors: Kheir Daouadi, Rim Rebaï, Ikram Amous

Abstract: Nowadays, bot detection from Twitter attracts the attention of several researchers around the world. Different bot detection approaches have been proposed as a result of these research efforts. Four of the main challenges faced in this context are the diversity of types of content propagated throughout Twitter, the problem inherent to the text, the lack of sufficient labeled datasets and the fact that the current bot detection approaches are not sufficient to detect bot activities accurately. We propose Twitterbot+, a bot detection system that leverages a minimal number of language-independent features extracted from a single tweet, with temporal enrichment of previously labeled datasets. We conducted experiments on three benchmark datasets with standard evaluation scenarios, and the achieved results demonstrate the efficiency of Twitterbot+ against the state-of-the-art. This yielded promising accuracy results (>95%). Our proposition is suitable for accurate and real-time use in a Twitter data collection step as an initial filtering technique to improve the quality of research data.

Research Article Tue, 28 Apr 2020 00:00:00 +0300
Cyberattack Response Model for the Nuclear Regulator in Slovenia https://lib.jucs.org/article/22671/ JUCS - Journal of Universal Computer Science 25(11): 1437-1457

DOI: 10.3217/jucs-025-11-1437

Authors: Samo Tomažič, Igor Bernik

Abstract: Cyberattacks targeting the nuclear sector are now a reality; they are becoming increasingly frequent and sophisticated, while the perpetrators are increasingly motivated. The key stakeholders in the nuclear sector, such as nuclear facility operators, nuclear regulators responsible for nuclear safety or nuclear security, technical support organisations and computer equipment suppliers, must take the necessary cybersecurity measures to prepare for potential cyberattacks and provide the highest possible level of response to such cyberattacks. This can only be achieved by adopting a systematic approach to cyberattack response. When conducting the research study presented herein, a descriptive method was applied to review the scientific literature, various standards, recommendations and guides, as well as to devise an inventory of publicly available sources. On the basis of such an analysis, individual questions were then formulated in order to compile a structured interview, which was conducted with international experts working at nuclear facilities, nuclear regulators, technical support organisations, computer equipment suppliers and other organisations responsible for providing cybersecurity in the nuclear sector. On the basis of their responses, researchers devised an innovative and comprehensive Cyberattack Response Model to be used by Slovenia's nuclear safety regulator and the regulator responsible for the physical protection of nuclear facilities and nuclear and radioactive materials.

Research Article Thu, 28 Nov 2019 00:00:00 +0200
High-Performance Simulation of Drug Release Model Using Finite Element Method with CPU/GPU Platform https://lib.jucs.org/article/22658/ JUCS - Journal of Universal Computer Science 25(10): 1261-1278

DOI: 10.3217/jucs-025-10-1261

Authors: Akhtar Ali, Imran Bajwa, Rafaqat Kazmi

Abstract: This paper describes a hybrid CPU/GPU approach for solving a two-phase mathematical model numerically. The dynamics of drug release between the first phase (coating) and second phase (arterial tissue) are represented by a system of partial differential equations (PDEs). The system of equations is discretized by the Finite Element Method. The whole discretized system involves a large sparse system of equations, which requires heavy computation. The CPU/GPU approach provides a platform to solve PDEs having extensive computations in parallel. Consequently, this platform can significantly reduce the solution times compared to a CPU-only implementation. This allows for more efficient investigation of different mathematical models, as well as the governing parameters. In this paper, a parallel computing framework is presented to solve the governing equations numerically using Graphics Processing Units (GPUs) with CUDA. This two-phase model investigates the impact of key parameters related to mass concentrations and drug release from tissue and coating layers. The identification and the role of major parameters (such as filtration velocity, the ratio of accessible void volume to solid volume, and the solid-liquid mass transfer rate) are highlighted. Furthermore, the motivation and guidance for using parallel computing in order to handle the computational complexities and the large sparse system that arises after discretizing the model equations are explained. We have designed a hybrid CPU/GPU solution of the proposed model by using Matlab. The parallel performance results show that the CPU/GPU architecture is more efficient in large-scale problem simulations.

Research Article Mon, 28 Oct 2019 00:00:00 +0200
Determination of System Weaknesses Based on the Analysis of Vulnerability Indexes and the Source Code of Exploits https://lib.jucs.org/article/22645/ JUCS - Journal of Universal Computer Science 25(9): 1043-1065

DOI: 10.3217/jucs-025-09-1043

Authors: Andrey Fedorchenko, Elena Doynikova, Igor Kotenko

Abstract: Currently the problem of monitoring the security of information systems is highly relevant. One of the important security monitoring tasks is to automate the process of determining system weaknesses for their further elimination. The paper considers techniques for the analysis of vulnerability indexes and exploit source code, as well as their subsequent classification. The suggested approach uses open security sources and incorporates two techniques, depending on the available security data. The first technique is based on the analysis of publicly available vulnerability indexes of the Common Vulnerability Scoring System for vulnerability classification by weaknesses. The second one complements the first when there are exploits but no associated vulnerabilities, and therefore the indexes for classification are absent. It is based on the analysis of the exploit source code for the features, i.e. indexes, using graph models. The extracted indexes are further used for weakness determination using the first technique. The paper provides experiments demonstrating the effectiveness and potential of the developed techniques. The obtained results and the methods for their enhancement are discussed.

Research Article Sat, 28 Sep 2019 00:00:00 +0300
A Model for Resource Management in Smart Cities Based on Crowdsourcing and Gamification https://lib.jucs.org/article/22643/ JUCS - Journal of Universal Computer Science 25(8): 1018-1038

DOI: 10.3217/jucs-025-08-1018

Authors: Rodrigo Barbosa Sousa Orrego, Jorge Luis Victória Barbosa

Abstract: Resources of a city are urban assets such as hospitals and pharmacies (health facilities) or accessible ramps and adapted toilets (accessibility resources). This paper addresses the problem of resource management for smart cities combining crowdsourcing with gamification, and proposes a model called CORE-MM. This model allows the use of crowdsourcing techniques so that the management of city resources is done by the citizens, without having to rely on an organization or public administration. To encourage participation in this resource management, this model also uses techniques of gamification. CORE-MM proposes the use of crowdsourcing integrated with gamification to manage the resources of a smart city, with two interdependent objectives: to motivate the use of the system by the users, and to encourage their participation in the sharing and management of information. The scientific contribution of this work is that CORE-MM treats resource management with a generic resources approach for smart cities. A prototype of CORE-MM was offered to volunteers and a questionnaire was developed to collect data and to evaluate the model, its performance and relevance. Results with volunteers indicated good perceived ease of use and good perceived utility. Of the responses given to the questionnaire statements by the 10 volunteers who tested the CORE-MM prototype, 91.67% agreed on the ease of use of the system and 8.33% expressed indifference. Regarding the utility of the system, 99.17% agreed and only 0.83% were indifferent. These results point to positive perspectives regarding the use of the application in possible situations and real locations.

Research Article Wed, 28 Aug 2019 00:00:00 +0300
Improving Ontology Matching Using Application Requirements for Segmenting Ontologies https://lib.jucs.org/article/22632/ JUCS - Journal of Universal Computer Science 25(7): 816-839

DOI: 10.3217/jucs-025-07-0816

Authors: Diego Pessoa, Ana Salgado, Bernadette Lóscio

Abstract: Ontology matching is concerned with finding relations between elements of different ontologies. In large-scale settings, some significant challenges arise, such as how to achieve a reduction in the time it takes to perform matching and how to improve the quality of results. Current techniques involve the use of ontology segmentation to overcome having such a large number of elements to compare. However, current methods usually select the most relevant ontology elements based on the number of relationships, which may dismiss some elements should they have fewer or no relationships. Therefore, we propose an algorithm for ontology segmentation based on application requirements, in such a way that the users can specify the concepts that are the most relevant in their application context to generate the segments which will be used as an input for the matching. In the experiments, we found a general reduction in the execution time and some significant quality improvements, depending on what matcher is applied. In order to assess the proposed algorithm, we considered some well-known evaluation measures, such as precision, recall, and F-Measure.

Research Article Sun, 28 Jul 2019 00:00:00 +0300
Survey on Ranking Functions in Keyword Search over Graph-Structured Data https://lib.jucs.org/article/22603/ JUCS - Journal of Universal Computer Science 25(4): 361-389

DOI: 10.3217/jucs-025-04-0361

Authors: Asieh Ghanbarpour, Hassan Naderi

Abstract: Keyword search is known as an attractive alternative to structured query languages for querying graph-structured data. A keyword query is expressed by a set of keywords and answered by a set of connected structures from the database, which totally or partially cover the queried keywords. These results show how the queried keywords are related in the database. Since there may be numerous results to a given query, a ranking function is essential to present the top-k most relevant results to the user. The effectiveness of this function directly affects the effectiveness of the keyword search system. In this paper, we survey the proposed ranking functions in the context of keyword search. First, the proposed models for the results of a keyword query are discussed and a categorization of them is presented. Next, the effective factors in determining the relevance of results are examined. Then, various ranking functions for ordering the results of a query are described and categorized based on their main view in determining the semantics of the results. Finally, we present an analysis of these classes and discuss the evolution of new research strategies to resolve the issues associated with the ranking of results in the keyword search domain.

Research Article Sun, 28 Apr 2019 00:00:00 +0300
Human Language Technologies: Key Issues for Representing Knowledge from Textual Information https://lib.jucs.org/article/23709/ JUCS - Journal of Universal Computer Science 24(11): 1651-1676

DOI: 10.3217/jucs-024-11-1651

Authors: Yoan Gutiérrez, Elena Lloret, José Gómez

Abstract: Ontologies are appropriate structures for capturing and representing the knowledge about a domain or task. However, the design and further population of them are both difficult tasks, normally addressed in a manual or in a semi-automatic manner. The goal of this article is to define and extend a task-oriented ontology schema that semantically represents the information contained in texts. This information can be extracted using Human Language Technologies, and throughout this work, the whole process to design such an ontology schema is described. Then, we also describe an algorithm to automatically populate ontologies based on our Human Language Technology oriented schema, avoiding the unnecessary duplication of instances, and having as a result the required information in a more compact and useful format ready to exploit. Tangible results are provided, such as permanent online access points to the ontology schema, an example bucket (i.e. ontology instance repository) based on a real scenario, and a documentation Web page.

Research Article Wed, 28 Nov 2018 00:00:00 +0200
Community Detection Applied on Big Linked Data https://lib.jucs.org/article/23707/ JUCS - Journal of Universal Computer Science 24(11): 1627-1650

DOI: 10.3217/jucs-024-11-1627

Authors: Laura Po, Davide Malvezzi

Abstract: The Linked Open Data (LOD) Cloud has more than tripled its sources in just six years (from 295 sources in 2011 to 1163 datasets in 2017). The actual Web of Data contains more than 150 billion triples. We are witnessing a staggering growth in the production and consumption of LOD and the generation of increasingly large datasets. In this scenario, providing researchers, domain experts, but also businessmen and citizens with visual representations and intuitive interactions can significantly aid the exploration and understanding of the domains and knowledge represented by Linked Data. Various tools and web applications have been developed to enable the navigation and browsing of the Web of Data. However, these tools fall short in producing high level representations for large datasets, and in supporting users in the exploration and querying of these big sources. Following this trend, we devised a new method and a tool called H-BOLD (High level visualizations on Big Open Linked Data). H-BOLD enables the exploratory search and multilevel analysis of Linked Open Data. It offers different levels of abstraction on Big Linked Data. Through the user interaction and the dynamic adaptation of the graph representing the dataset, it will be possible to perform an effective exploration of the dataset, starting from a set of few classes and adding new ones. Performance and portability of H-BOLD have been evaluated on the SPARQL endpoint listed on SPARQL ENDPOINT STATUS. The effectiveness of H-BOLD as a visualization tool is described through a user study.

Research Article Wed, 28 Nov 2018 00:00:00 +0200
Open Domain Targeted Sentiment Classification Using Semi-Supervised Dynamic Generation of Feature Attributes https://lib.jucs.org/article/23705/ JUCS - Journal of Universal Computer Science 24(11): 1582-1603

DOI: 10.3217/jucs-024-11-1582

Authors: Shadi Abudalfa, Moataz Ahmed

Abstract: Microblogging services have grown significantly and enable people to conveniently share their sentiments (opinions) on matters of concern. Such sentiments have shown an impact on many fields such as economics and politics. Different sentiment analysis approaches have been proposed in the literature to automatically predict sentiments shared in micro-blogs (e.g., tweets). One class of such approaches predicts the opinion towards a specific target (entity); this class is referred to as target-dependent sentiment classification. Another class, called open domain targeted sentiment classification, extracts targets from the micro-blog and predicts the sentiment towards them. In this research work, we propose a new semi-supervised learning technique for open domain targeted sentiment classification that uses smaller amounts of labelled data. To the best of our knowledge, our model represents the first semi-supervised technique proposed for open domain targeted sentiment classification. Additionally, we propose a new supervised learning model for improving the accuracy of open domain targeted sentiment classification. Moreover, we show for the first time that SVM HMM is able to improve the accuracy of open domain targeted sentiment classification. Experimental results show that our proposed technique outperforms other prominent techniques available in the literature.

Research Article Wed, 28 Nov 2018 00:00:00 +0200
Enhancing Spatial Keyword Preference Query with Linked Open Data https://lib.jucs.org/article/23704/ JUCS - Journal of Universal Computer Science 24(11): 1561-1581

DOI: 10.3217/jucs-024-11-1561

Authors: João Paulo Dias De Almeida, Frederico Durão, Arthur Fortes da Costa

Abstract: This paper presents a Spatial Keyword Preference Query (SKPQ) enhanced by Linked Open Data. This query selects objects based on the textual description of features in their neighborhood. The spatial relationship between objects and features is explored by the SKPQ using a Spatial Inverted Index. In our approach, the spatial relationship is explored using SPARQL. However, the main benefit of using SPARQL is obtained by measuring the textual relevance between the features' descriptions and the user's keywords. The object description in Linked Open Data is much richer than in traditional spatial databases, which leads to a more precise similarity measure than the one employed in the traditional SKPQ. We present an enhanced SKPQ, an algorithm to process this enhanced query, and two experimental evaluations of the proposed algorithm, comparing it with the traditional SKPQ. The first experiment indicates a relative NDCG improvement of 20% for the proposed approach over the traditional SKPQ when using random query keywords. The second experiment shows that, using real query keywords, our approach obtained a significant increase in the MAP score.
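
To make the reported "relative NDCG improvement" concrete, the following is a minimal sketch of how NDCG and a relative improvement are commonly computed. It assumes the standard log2-discounted definition of DCG; the abstract does not say which NDCG variant the authors used, and the function names and sample values here are hypothetical.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance scores."""
    # Rank 0 is the top result; the discount is log2(rank + 2).
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, e.g. 0.2 means 20%."""
    return (new - baseline) / baseline

# Hypothetical graded relevance of the top-3 results for one query.
print(ndcg([3, 2, 1]))   # already in ideal order
print(ndcg([1, 2, 3]))   # worst ordering of the same items
print(relative_improvement(0.6, 0.5))
```

Under this definition, a 20% relative improvement means the new NDCG is 1.2 times the baseline NDCG, not 20 percentage points higher.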

Research Article Wed, 28 Nov 2018 00:00:00 +0200
EduRP: an Educational Resources Platform based on Opinion Mining and Semantic Web https://lib.jucs.org/article/23699/ JUCS - Journal of Universal Computer Science 24(11): 1515-1535

DOI: 10.3217/jucs-024-11-1515

Authors: Maritza López, Giner Alor-Hernández, José Sánchez-Cervantes, María del Pilar Salas-Zárate, Mario Paredes-Valverde

Abstract: Educational platforms have become important tools for e-learning; nonetheless, finding the appropriate educational resources to use often represents a tedious task for learners. Opinions in the educational domain are important information for decision making; they allow teachers to improve the teaching process and enable students to decide on the best educational resources. The large amount of data generated daily on the Web makes it difficult, however, to analyze opinions manually. Multiple opinion mining approaches are being proposed as a solution to this problem; this research work introduces EduRP, an education platform that integrates opinion mining techniques and ontology-based user profiling techniques. We specifically propose an opinion mining approach for Spanish text which consists of three main steps: 1) collect opinions from the EduRP platform, 2) process the opinions to normalize the text, and 3) obtain the polarity of the opinions using a machine learning approach. We also propose a profile customization approach that uses Semantic Web technologies, specifically ontologies, to integrate socio-demographic data from different social networks and from the platform itself. Finally, we assess the performance of our system using precision, recall, and F-measure metrics, obtaining average values of 81.85%, 81.80% and 81.54%, respectively.

Research Article Wed, 28 Nov 2018 00:00:00 +0200
Astmapp: A Platform for Asthma Self-Management https://lib.jucs.org/article/23695/ JUCS - Journal of Universal Computer Science 24(11): 1496-1514

DOI: 10.3217/jucs-024-11-1496

Authors: Harry Luna-Aveiga, José Medina-Moreira, Oscar Apolinario-Arzube, Mario Paredes-Valverde, Katty Lagos-Ortiz, Rafael Valencia-García

Abstract: Asthma is a chronic lung disease of the airways that makes breathing difficult. Worldwide, asthma is a leading disease among children and adolescents and a leading cause of hospitalizations among adolescents. Asthma self-management is a systematic procedure that allows educating, training, and informing patients to control their disease and avoid it when it is possible and reduce it when it is necessary. Nowadays, there is a need for technological tools for supporting different tasks within the process of asthma self-management, such as education, control, and monitoring, that help patients and their families improve their quality of life and reduce the direct and indirect costs. This work proposes Astmapp, a platform that relies on semantic and mobile technologies and recommender systems to increase the patients' knowledge about asthma regarding topics such as triggers, symptoms, activity restrictions, medications, among others, and to promote the asthma control by means of the monitoring of symptoms and parameters such as physical activity, heart rate, blood pressure, temperature, among others. Likewise, Astmapp recommends educational resources based on the preferences of patients and generates medical recommendations based on the symptoms and health status of the patient aiming to prevent asthma and reduce its exacerbation. Astmapp was evaluated in terms of its ability to recommend asthma educational resources relevant for the patients as well as to provide health recommendations. The evaluation results suggest that Astmapp has the potential to effectively support the asthma self-management process.

Research Article Wed, 28 Nov 2018 00:00:00 +0200
Building an Educational Platform Using NLP: A Case Study in Teaching Finance https://lib.jucs.org/article/23607/ JUCS - Journal of Universal Computer Science 24(10): 1403-1423

DOI: 10.3217/jucs-024-10-1403

Authors: Soto Montalvo, Jesus Palomo, Carmen Orden

Abstract: Information overload is one of the main challenges in the current educational context, where the Internet has become a major source of information. According to the European Space for Higher Education, students must now be more autonomous and creative, with lecturers being required to provide guidance and supervision. Guiding students to search and read news related to subjects that are being studied in class has proven to be an effective technique in improving motivation, because students appreciate the relevance of the topics being studied in real world examples. However, one of the main drawbacks of this teaching practice is the amount of time that lecturers and students need for searching relevant and useful information on different subjects. The objective of our research is to demonstrate the usefulness of a complementary teaching tool in the traditional educational classroom. It is a new educational platform that combines Artificial Intelligence techniques with the expertise provided by lecturers. It automatically compiles information from different sources and presents only relevant breaking news classified into different subjects and topics. It has been tested on a Finance course, where being continually informed about the latest economic and financial news is an important part of the teaching process, especially for certain key financial concepts. The utility of the platform has been studied by conducting student surveys. The results confirm that using the platform had a positive impact on improving students' motivation and boosting the learning processes. This research provides evidence of the effectiveness of the new educational complement to traditional teaching methods in classrooms. It also demonstrates an improvement in knowledge transfer within an environment of information overload.

Research Article Sun, 28 Oct 2018 00:00:00 +0300
Modelling of Automotive Engine Dynamics using Diagonal Recurrent Neural Network https://lib.jucs.org/article/23542/ JUCS - Journal of Universal Computer Science 24(9): 1330-1342

DOI: 10.3217/jucs-024-09-1330

Authors: Yujia Zhai, Kejun Qian, Fei Xue, Moncef Tayahi

Abstract: Spark-ignition (SI) engine dynamics is described as a severely nonlinear and fast process. A black-box model obtained by a system identification approach is often valuable for control and fault diagnosis applications on such systems. A recurrent neural network (RNN) might be better suited for such dynamical system modelling than a feed-forward neural network because of its feedback scheme. However, the computational load of an RNN limits its practical application. In this paper, a diagonal recurrent neural network (DRNN) is investigated to model SI engine dynamics and to achieve a balance between modelling performance and computational burden. The data collection procedure and the algorithms for training the DRNN are also presented. Satisfactory modelling results have been obtained at moderate computational cost.

Research Article Fri, 28 Sep 2018 00:00:00 +0300
Cloud Biometric Authentication: An Integrated Reliability and Security Method Using the Reinforcement Learning Algorithm and Queue Theory https://lib.jucs.org/article/23145/ JUCS - Journal of Universal Computer Science 24(4): 372-391

DOI: 10.3217/jucs-024-04-0372

Authors: A M N Balla Husamelddin, Guang Chen, Weipeng Jing

Abstract: While cloud systems deliver a larger amount of computing power, they do not guarantee full security and reliability. Focusing on improving successful job execution under resource constraints and security problems, this work proposes an enhanced, effective, integrated and novel approach to security and reliability. To apply a high level of security in the system, our novel approach uses cloud biometric authentication by splitting the biometric data into small chunks and spreading it over the cloud's resources. Reliability is enhanced through successful job execution by employing an adaptive reinforcement learning (RL) algorithm combined with a queuing theory. Our approach supports task schedulers to effectively adapt to dynamic changes in cloud environments. Based on the idea of reliability, we developed an adaptive action-selection, which controls the action selection dynamically by considering queue buffer size and the uncertainty value function. We evaluated the performance of our approach by several experiments conducted in terms of successful task execution and utilization rate and then compared our approach with other job scheduling policies. The experimental results demonstrated the efficiency of our method and achieved the objectives of the proposed system.

Research Article Sat, 28 Apr 2018 00:00:00 +0300
Does the Users' Tendency to Seek Information Affect Recommender Systems' Performance? https://lib.jucs.org/article/22983/ JUCS - Journal of Universal Computer Science 23(2): 187-207

DOI: 10.3217/jucs-023-02-0187

Authors: Umberto Panniello, Lorenzo Ardito, Antonio Petruzzelli

Abstract: Much work has been done on developing recommender system (RS) algorithms, on comparing them using business metrics (such as customers' trust or perception of recommendations' novelty) and on exploring users' reactions to recommendations. It has been demonstrated that different recommender systems perform differently on several performance metrics and that different users react differently to the same kind of recommendations. As a consequence, some scholars have called for exploring how users with different tendencies to seek information during their purchasing process may react to different kinds of recommendations. To the best of our knowledge, no prior work has studied whether users' tendency to seek information has an effect on recommender systems' performance. Different users may have different propensities to seek information and to receive suggestions, and therefore they may react differently to the same recommendations. To this aim, we performed a live experiment with real customers of a European firm.

Research Article Mon, 28 Aug 2017 00:00:00 +0300
An Anesthesia Alert System based on Dynamic Profiles Inferred through the Medical History of Patients https://lib.jucs.org/article/23435/ JUCS - Journal of Universal Computer Science 23(8): 705-724

DOI: 10.3217/jucs-023-08-0705

Authors: Jorge Luis Victória Barbosa, Bruno Sempe, Bruno Mota, Leandro Dini

Abstract: Anesthesia Information Management Systems (AIMSs) have existed for many decades. However, how to turn patient records into strategic information to improve the anesthesia process is still a research challenge. We did not find systems that use data from previous procedures for issuing alerts. This data can prevent errors during procedures and aid in medical staff evaluation. We propose SaneWatch, an alert system guided by the medical history of patients. SaneWatch uses configurable rules to continuously review the patient's history and automatically generate an anesthesia profile. This dynamic profile allows the emission of strategic alerts during anesthesia procedures. We have implemented and integrated the system in an AIMS that has been used for the past four years by more than 40 anesthesiologists in several hospitals in the city of Porto Alegre in southern Brazil. We applied the integrated system in a practical experiment. Twenty doctors tried it and filled out a questionnaire based on the Technology Acceptance Model (TAM). An overall strong agreement of 96% was obtained in the perceived usefulness assessment. In addition, 86% of users indicated that the system was easy to use. The results were encouraging and demonstrate the potential for implementing SaneWatch in anesthesia procedures. However, 12% of doctors disagreed with regard to ease of use, showing that the system needs improvements in interface-related aspects.

Research Article Mon, 28 Aug 2017 00:00:00 +0300
On Predicting Election Results using Twitter and Linked Open Data: The Case of the UK 2010 Election https://lib.jucs.org/article/23060/ JUCS - Journal of Universal Computer Science 23(3): 280-303

DOI: 10.3217/jucs-023-03-0280

Authors: Evangelos Kalampokis, Areti Karamanou, Efthimios Tambouris, Konstantinos Tarabanis

Abstract: The analysis of Social Media data enables eliciting public behaviour and opinion. In this context, a number of studies have recently explored Social Media's capability to predict the outcome of real-world phenomena. The results of these studies are controversial, with elections being the most disputed phenomenon. The objective of this paper is to present a case of predicting the results of the UK 2010 election through Twitter. In particular, we study to what extent it is possible to use Twitter data to accurately predict the percentage of votes of the three most prominent political parties, namely the Conservative Party, the Liberal Democrats, and the Labour Party. The approach we follow capitalises on (a) a theoretical Social Media data analysis framework for predictions and (b) Linked Open Data to enrich Twitter data. We extensively discuss each step of the framework to emphasise the details that could affect the prediction accuracy. We anticipate that this paper will contribute to the ongoing discussion of understanding to what extent and under which circumstances election results are predictable through Social Media.

Research Article Tue, 28 Mar 2017 00:00:00 +0300
A Method for Privacy-preserving Collaborative Filtering Recommendations https://lib.jucs.org/article/22977/ JUCS - Journal of Universal Computer Science 23(2): 146-166

DOI: 10.3217/jucs-023-02-0146

Authors: Christos Georgiadis, Nikolaos Polatidis, Haralambos Mouratidis, Elias Pimenidis

Abstract: With the continuous growth of the Internet and the progress of electronic commerce the issues of product recommendation and privacy protection are becoming increasingly important. Recommender Systems aim to solve the information overload problem by providing accurate recommendations of items to users. Collaborative filtering is considered the most widely used recommendation method for providing recommendations of items or users to other users in online environments. Additionally, collaborative filtering methods can be used with a trust network, thus delivering to the user recommendations from both a database of ratings and from users who the person who made the request knows and trusts. On the other hand, the users are having privacy concerns and are not willing to submit the required information (e.g., ratings for products), thus making the recommender system unusable. In this paper, we propose (a) an approach to product recommendation that is based on collaborative filtering and uses a combination of a ratings network with a trust network of the user to provide recommendations and (b) 'neighbourhood privacy' that employs a modified privacy-aware role-based access control model that can be applied to databases that utilize recommender systems. Our proposed approach (1) protects user privacy with a small decrease in the accuracy of the recommendations and (2) uses information from the trust network to increase the accuracy of the recommendations, while, (3) providing privacy-preserving recommendations, as accurate as the recommendations provided without the privacy-preserving approach or the method that increased the accuracy applied.

Research Article Tue, 28 Feb 2017 00:00:00 +0200
Exploring Teachers' Perceptions on Modeling Effort Demanded by CSCL Designs with Explicit Artifact Flow Support https://lib.jucs.org/article/23595/ JUCS - Journal of Universal Computer Science 22(10): 1398-1417

DOI: 10.3217/jucs-022-10-1398

Authors: Osmel Bordies, Yannis Dimitriadis

Abstract: Artifact flow represents an important aspect of teaching/learning processes, especially in CSCL situations in which complex relationships may be found. However, explicit modeling of CSCL processes with artifact flow may increase the cognitive load and associated effort of the teachers-designers and therefore decrease the efficiency of the design process. The empirical study, reported in this paper and grounded on mixed methods, provides evidence of the effort overload when teachers are involved in designing CSCL situations in a controlled environment. The results of the study illustrate the problem through the subjective perception of the participating teachers, complemented with objective parameters, such as time consumed, errors committed, uncertainty and objective complexity metrics.

Research Article Sat, 1 Oct 2016 00:00:00 +0300
Boosting Point-of-Interest Recommendation with Multigranular Time Representations https://lib.jucs.org/article/23430/ JUCS - Journal of Universal Computer Science 22(8): 1148-1174

DOI: 10.3217/jucs-022-08-1148

Authors: Gonzalo Rojas, Diego Seco, Francisco Serrano

Abstract: Recommender system technologies are being increasingly adopted by Location Based Social Networks (LBSNs) with the purpose of recommending Points-of-Interest (POIs) to their users, and different contextual characteristics have been incorporated to enhance this process. Among these characteristics, the time at which users express their preferences (typically, by checking in to different POIs) and ask for recommendations is frequently referred to as a first-order feature in this process. However, even though its influence on improving the accuracy of recommendations has been empirically demonstrated, time is still mainly considered through a monogranular representation (one-hour or one-day blocks). In this article, we introduce a POI recommendation approach based on a multigranular characterization of time, composed of hour, day-of-the-week, and month. Based on this concept, we propose two representations of user check-ins: one that directly extends a monogranular proposal of time for POI recommendations, and another based on a statistical representation of check-in distributions in time. For both representations, corresponding algorithms to compute user similarity and preference prediction are introduced. The experimental evaluation shows promising results in terms of accuracy and scalability.
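
As an illustration of what a multigranular characterization of time might look like in practice, the sketch below maps each check-in timestamp to an (hour, day-of-the-week, month) triple and counts check-ins per triple. This is only one plausible reading of the abstract; the function names, the tuple encoding, and the sample data are assumptions, not the authors' representation.

```python
from collections import Counter
from datetime import datetime

def multigranular_key(ts: datetime) -> tuple:
    """Map a timestamp to (hour, day_of_week, month); Monday is 0."""
    return (ts.hour, ts.weekday(), ts.month)

def checkin_profile(timestamps):
    """Count check-ins per multigranular time slot (hypothetical helper)."""
    return Counter(multigranular_key(ts) for ts in timestamps)

# Hypothetical check-ins of one user at one POI.
checkins = [
    datetime(2016, 8, 1, 9, 30),  # Monday morning in August
    datetime(2016, 8, 8, 9, 5),   # the following Monday, same hour
    datetime(2016, 8, 6, 21, 0),  # Saturday evening
]
profile = checkin_profile(checkins)
print(profile)
```

A monogranular representation would keep only one component of the key (for example the hour), so the two Monday-morning check-ins and a hypothetical Sunday 9 a.m. check-in would be indistinguishable.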

Research Article Mon, 1 Aug 2016 00:00:00 +0300
A Semantic Filtering Mechanism Geared Towards Context Dissemination in Ubiquitous Environments https://lib.jucs.org/article/23429/ JUCS - Journal of Universal Computer Science 22(8): 1123-1147

DOI: 10.3217/jucs-022-08-1123

Authors: Guilherme Melo e Maranhão, Renato Bulcão-Neto

Abstract: A great challenge in context-aware computing is dealing with the heterogeneity and volume of sensors data. A problem regarding that scenario is to notify context-aware applications, which have distinct interests of context events in terms of volume, semantic and complexity, in an efficient and relevant manner. Aiming to solve this problem, this research focuses on a new approach for filtering semantic context towards supporting context dissemination. This mechanism is to be aligned with the reasoning capabilities of a context-aware solution and also be maintainable and extensible to efficiently support changes in an ontological model. A performance evaluation is carried out in a simulated scenario of vital signs monitoring in Intensive Care Units and wards. Hermes Interpreter's behaviour is analysed when dealing with filters of different complexities and also an increasing number of subscribers per vital sign. Results demonstrate the high cost of the semantic filtering mechanism in comparison with pure context reasoning activities.

Research Article Mon, 1 Aug 2016 00:00:00 +0300
Applying Brand Equity Theory to Understand Consumer Opinion in Social Media https://lib.jucs.org/article/23210/ JUCS - Journal of Universal Computer Science 22(5): 709-734

DOI: 10.3217/jucs-022-05-0709

Authors: Evangelos Kalampokis, Areti Karamanou, Efthimios Tambouris, Konstantinos Tarabanis

Abstract: Billions of people use Social Media (SM), such as Facebook and Twitter, every day to express their opinions and experiences with brands. Companies are highly interested in understanding such SM brand-related content. Consequently, many studies have been conducted and many applications have been developed to analyse this content. For analysis purposes, the main SM metrics used include volume and sentiment. Interestingly, however, brand equity theory proposes different metrics for assessing brand reputation. These include brand image, brand satisfaction and purchase intention (henceforth referred to as marketing metrics). The objective of this paper is to explore the feasibility of applying marketing metrics to Twitter brand-related content. For this purpose, we collect, study and analyse tweets that mention two brands, namely IKEA and Gatorade. The manual analysis suggests that a significant number of brand tweets are related to brand image, brand satisfaction and purchase intention. We thereafter design an algorithm that classifies tweets into relevant categories to enable automatic marketing metrics computation. We implement the algorithm using statistical learning approaches and show that its classification accuracy is good. We anticipate that this article will motivate other studies as well as application designers to adopt marketing theories when evaluating brand reputation through SM content.

Research Article Sun, 1 May 2016 00:00:00 +0300
Sentiment Classification of Spanish Reviews: An Approach based on Feature Selection and Machine Learning Methods https://lib.jucs.org/article/23209/ JUCS - Journal of Universal Computer Science 22(5): 691-708

DOI: 10.3217/jucs-022-05-0691

Authors: Mario Paredes-Valverde, Jorge Limon-Romero, Diego Tlapa, Yolanda Baez-Lopez

Abstract: Sentiment analysis aims to extract users' opinions from review documents. Nowadays, there are two main approaches for sentiment analysis: the semantic orientation and the machine learning. Sentiment analysis approaches based on Machine Learning (ML) methods work over a set of features extracted from the users' opinions. However, the high dimensionality of the feature vector reduces the effectiveness of this approach. In this sense, we propose a sentiment classification method based on feature selection mechanisms and ML methods. The present method uses a hybrid feature extraction method based on POS pattern and dependency parsing. The features obtained are enriched semantically through common-sense knowledge bases. Then, a feature selection method is applied to eliminate the noisy and irrelevant features. Finally, a set of classifiers is trained in order to classify unknown data. To prove the effectiveness of our approach, we have conducted an evaluation in the movies and technological products domains. Also, our proposal was compared with well-known methods and algorithms used on the sentiment classification field. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.786 to 0.898 for the aforementioned domains.

Research Article Sun, 1 May 2016 00:00:00 +0300
Feature Based Sentiment Analysis for Service Reviews https://lib.jucs.org/article/23207/ JUCS - Journal of Universal Computer Science 22(5): 650-670

DOI: 10.3217/jucs-022-05-0650

Authors: Ariyur Abirami, Abdulkhader Askarunisa

Abstract: Sentiment Analysis deals with the analysis of emotions, opinions and facts in the sentences that people express. It allows us to track people's attitudes and feelings by analyzing blogs, comments, reviews and tweets about all kinds of aspects. The development of the Internet has had a strong influence on all types of industries, such as tourism, healthcare and business in general. The availability of the Internet has changed the way users access information and share their experiences. Social media provide this information, and these comments are trusted by other users. This paper examines the use and impact of social media on the healthcare industry by analyzing users' feelings expressed as free text, thereby giving quality indicators for services or the features related to them. In this paper, a sentiment classifier model using an improved Term Frequency Inverse Document Frequency (TF-IDF) method and a linear regression model is proposed to classify online reviews, tweets or customer feedback for various features. The model gathers online user reviews about hospitals and analyzes those reviews in terms of the sentiments expressed. The Information Extraction process filters irrelevant reviews, extracts the sentiment words of the identified features and quantifies the sentiment of features using a sentiment dictionary. Emotionally expressed positive or negative words are assigned weights using the classification prescribed in the dictionary. The sentiment analysis of tweets/reviews is done for various features using Natural Language Processing (NLP) and Information Retrieval (IR) techniques. The proposed linear regression model uses the senti-score to predict the star rating of a service feature. The statistical results show that the improved TF-IDF method gives better accuracy than the TF and TF-IDF methods used for representing the text. The senti-score obtained from text analysis of user feedback on features gives not only an opinion summarization but also comparative results on various features of different competitors. This information can be used by businesses to focus on the low-scored features so as to improve their business and ensure a very high level of user satisfaction.
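
For readers unfamiliar with the baseline that the abstract compares against, the sketch below computes plain TF-IDF weights for tokenized reviews. It uses the textbook formulation (relative term frequency times log of inverse document frequency); it is not the "improved" TF-IDF variant proposed in the paper, whose details the abstract does not give, and the sample reviews are invented.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, already-normalized hospital reviews.
reviews = [
    "staff friendly clean rooms".split(),
    "staff rude long waiting".split(),
    "clean rooms short waiting".split(),
]
weights = tf_idf(reviews)
print(weights[0])
```

In this toy corpus, "friendly" occurs in only one review and so receives a higher weight in the first review than "staff", which occurs in two.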

Research Article Sun, 1 May 2016 00:00:00 +0300
Opinion Retrieval for Twitter Using Extrinsic Information https://lib.jucs.org/article/23205/ JUCS - Journal of Universal Computer Science 22(5): 608-629

DOI: 10.3217/jucs-022-05-0608

Authors: Yoon-Sung Kim, Young-In Song, Hae-Chang Rim

Abstract: Opinion retrieval in social networks is a very useful field for industry because it can provide a facility for monitoring opinions about a product, person or issue in real time. An opinion retrieval system generally retrieves topically relevant and subjective documents based on topical relevance and a degree of subjectivity. Previous studies on opinion retrieval only considered the intrinsic features of original tweet documents and thus suffer from the data sparseness problem. In this paper, we propose a method of utilizing the extrinsic information of the original tweet and solving the data sparseness problem. We have found useful extrinsic features of related tweets, which can properly measure the degree of subjectivity of the original tweet. When we performed an opinion retrieval experiment including proposed extrinsic features within a learning-to-rank framework, the proposed model significantly outperformed both the baseline system and the state-of-the-art opinion retrieval system in terms of Mean Average Precision (MAP) and Precision@K (P@K) metrics.

Research Article Sun, 1 May 2016 00:00:00 +0300
A Novel Similar Temporal System Call Pattern Mining for Efficient Intrusion Detection https://lib.jucs.org/article/23120/ JUCS - Journal of Universal Computer Science 22(4): 475-493

DOI: 10.3217/jucs-022-04-0475

Authors: Vangipuram Radhakrishna, Puligadda Kumar, Vinjamuri Janaki

Abstract: Software security pattern mining is a recent research interest among researchers working in the areas of security and data mining. When an application runs, several associated processes and system calls are invoked in the background. In this paper, the major objective is to identify intrusions using temporal pattern mining. The idea is to find normal temporal system call patterns and use these patterns to identify abnormal temporal system call patterns. For finding normal system call patterns, we use the concept of temporal association patterns. The reference sequence is used to obtain temporal association system call patterns satisfying a specified dissimilarity threshold. To find similar (normal) temporal system call patterns, we apply our novel method, which performs only a single database scan, reducing the unnecessary overhead incurred when multiple scans are performed and thus achieving space and time efficiency. The importance of the approach stems from the fact that this is the first single database scan approach in the literature. To find out whether a given process is normal or abnormal, it is sufficient to verify whether there exists a temporal system call pattern which is not similar to the reference system call support sequence for the specified threshold. This eliminates the need for finding decision rules by constructing a decision table. The approach is efficient as it eliminates the need for finding decision rules (2^n is usually very large even for small values of n) and thus aims at efficient dimensionality reduction, as we consider only similar temporal system call sequences for deciding on intrusion.

Research Article Fri, 1 Apr 2016 00:00:00 +0300
A Domain Ontology in Social Networks for Identifying User Interest for Personalized Recommendations https://lib.jucs.org/article/23050/ JUCS - Journal of Universal Computer Science 22(3): 319-339

DOI: 10.3217/jucs-022-03-0319

Authors: Rung-Ching Chen, Hendry, Chung-Yi Huang

Abstract: Social media and the development of Web 2.0 encourage users to participate more interactively in social networks. In social networks, relationships may be identified from user posts and interactions. Using this data, the system can make recommendations tailored to specific users. However, when a user is on a social network for the first time, the recommendation system cannot make recommendations, since the user has no history. In this paper, we design an ontology combined with social networks. We develop the ontology based on data from users and their friends. Using user interests and community influences, we propose a system to solve the cold start problem in recommendation systems. The system calculates the similarity between users from their preferences and uses a rule-generating algorithm to create dynamic inference rules. The ontology is updated each time the content of the personal ontology is updated. The newest ontology is retained to increase accuracy the next time the recommendation system is executed.

Research Article Tue, 1 Mar 2016 00:00:00 +0200
Evaluating the Relative Performance of Collaborative Filtering Recommender Systems https://lib.jucs.org/article/23834/ JUCS - Journal of Universal Computer Science 21(13): 1849-1868

DOI: 10.3217/jucs-021-13-1849

Authors: Humberto Jesús Corona Pampín, Houssem Jerbi, Michael P. O'Mahony

Abstract: Past work on the evaluation of recommender systems indicates that collaborative filtering algorithms are accurate and suitable for the top-N recommendation task. Further, the importance of performance beyond accuracy has been recognised in the literature. Here, we present an evaluation framework based on a set of accuracy and beyond accuracy metrics, including a novel metric that captures the uniqueness of a recommendation list. We perform an in-depth evaluation of three well-known collaborative filtering algorithms using three datasets. The results show that the user-based and item-based collaborative filtering algorithms have a high inverse correlation between popularity and diversity and recommend a common set of items at large neighbourhood sizes. The study also finds that the matrix factorisation approach leads to more accurate and diverse recommendations, while being less biased toward popularity.

Research Article Mon, 28 Dec 2015 00:00:00 +0200
Content-based Information Retrieval by Named Entity Recognition and Verb Semantic Role Labelling https://lib.jucs.org/article/23832/ JUCS - Journal of Universal Computer Science 21(13): 1830-1848

DOI: 10.3217/jucs-021-13-1830

Authors: Betina J, G. Mahalakshmi

Abstract: Tamil Siddha medicine, an ancient medicinal system, has yielded a wide range of untapped information about traditional medicines. In this paper, we explore the various Natural Language Processing techniques that can be applied to this syntactically rich corpus. As domain information mostly concentrates on the central concepts, we start our work by identifying the Named Entities and categorizing them. An integrated NER classifier is built, comprising SVM and Decision Tree classifiers, with an accuracy as high as 95%. These entities play different roles in different contexts. Hence their roles are labelled along with the predicates surrounding them. These roles and predicates give rise to a rule-based sentence tagging system, trained by an MEM model, to tag different contents in this otherwise unstructured text. These two techniques are then exploited to develop our Information Retrieval System, which combines category tagging done by Named Entity Recognition and content tagging done by Semantic Role Labelling. The system takes full advantage of the rich features of the language and hence can be expanded to other domains.

Research Article Mon, 28 Dec 2015 00:00:00 +0200
A Distributed Recommendation Platform for Big Data https://lib.jucs.org/article/23830/ JUCS - Journal of Universal Computer Science 21(13): 1810-1829

DOI: 10.3217/jucs-021-13-1810

Authors: Daniel Valcarce, Javier Parapar, Álvaro Barreiro

Abstract: The vast amount of information that recommenders manage these days has reached a point where scalability has become a critical factor. In this work, we propose a scalable architecture designed for computing Collaborative Filtering recommendations in a Big Data scenario. In order to build a highly scalable and fault-tolerant platform, we employ fully distributed systems without any single point of failure. We study the use of data replication and data distribution technologies. Additionally, we consider different caching techniques. Taking into account these requirements, we propose particular technologies for each component of the platform. Next, we evaluate the response times of storing, generating and serving recommendations using MySQL Cluster and Cassandra showing that the latter technology is much more adequate for that purpose. Finally, we conduct a simulation for evaluating the impact of a memory caching system.

Research Article Mon, 28 Dec 2015 00:00:00 +0200
Queuing Theory-based Latency/Power Tradeoff Models for Replicated Search Engines https://lib.jucs.org/article/23829/ JUCS - Journal of Universal Computer Science 21(13): 1790-1809

DOI: 10.3217/jucs-021-13-1790

Authors: Ana Freire, Craig Macdonald, Nicola Tonellotto, Iadh Ounis, Fidel Cacheda

Abstract: Large-scale search engines are built upon huge infrastructures involving thousands of computers in order to achieve fast response times. However, the energy consumed (and hence the financial cost) is also high, leading to environmental damage. This paper proposes new approaches to increase energy and financial savings in large-scale search engines, while maintaining good query response times. We aim to improve current state-of-the-art models used for balancing power and latency, by integrating new advanced features. On one hand, we propose to improve the power savings by completely powering down the query servers that are not necessary when the load of the system is low. In addition, we incorporate energy rates into the model formulation. On the other hand, we focus on how to accurately estimate the latency of the whole system by means of Queueing Theory. Experiments using actual query logs attest to the high energy (and financial) savings with respect to current baselines. To the best of our knowledge, this is the first paper to successfully apply stationary Queueing Theory models to estimate the latency in a large-scale search engine.
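
A minimal sketch of how a stationary queueing model can estimate latency and guide how many query servers to keep powered on. It assumes an M/M/c queue (Poisson arrivals, exponential service times, c identical servers) and the Erlang C formula; the abstract does not state which model the authors use, and the function names are hypothetical.

```python
import math
from typing import Optional


def erlang_c(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Probability that an arriving query has to wait in an M/M/c queue."""
    offered = arrival_rate / service_rate      # offered load in Erlangs
    rho = offered / servers                    # per-server utilisation
    if rho >= 1.0:
        raise ValueError("unstable system: utilisation must be below 1")
    tail = offered ** servers / math.factorial(servers) / (1.0 - rho)
    head = sum(offered ** k / math.factorial(k) for k in range(servers))
    return tail / (head + tail)


def mean_response_time(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Expected queueing delay plus expected service time."""
    p_wait = erlang_c(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate) + 1.0 / service_rate


def min_servers(
    arrival_rate: float, service_rate: float, latency_target: float, max_servers: int
) -> Optional[int]:
    """Smallest number of active servers meeting the latency target, if any."""
    for c in range(1, max_servers + 1):
        if arrival_rate >= c * service_rate:
            continue  # not enough capacity to be stable with c servers
        if mean_response_time(arrival_rate, service_rate, c) <= latency_target:
            return c
    return None
```

With one server, an arrival rate of 1 and a service rate of 2 queries per second, the mean response time reduces to the familiar M/M/1 value 1 / (2 - 1) = 1 second. Servers beyond the value returned by `min_servers` are candidates for powering down at that load.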

Research Article Mon, 28 Dec 2015 00:00:00 +0200
Statistical Analysis to Establish the Importance of Information Retrieval Parameters https://lib.jucs.org/article/23827/ JUCS - Journal of Universal Computer Science 21(13): 1767-1789

DOI: 10.3217/jucs-021-13-1767

Authors: Julie Ayter, Adrian-Gabriel Chifu, Sébastien Déjean, Cecile Desclaux, Josiane Mothe

Abstract: Search engines are based on models to index documents, match queries and documents, and rank documents. Research in Information Retrieval (IR) aims at defining these models and their parameters in order to optimize the results. Using benchmark collections, it has been shown that there is no single best system configuration that works for every query, but rather that performance varies from one query to another. It would be interesting if a meta-system could decide which system configuration should process a new query by learning from the context of previous queries. This paper reports a deep analysis considering more than 80,000 search engine configurations applied to 100 queries and the corresponding performance. The goal of the analysis is to identify which configuration responds best to a certain type of query. We considered two approaches to define query types: one is post-evaluation, based on query clustering according to the performance measured with Average Precision, while the second approach is pre-evaluation, using query features (including query difficulty predictors) to cluster queries. Globally, we identified two parameters that should be optimized: retrieving_model and TrecQueryTags_process. One could expect such results, as these two parameters are major components of the IR process. However, our work results in two main conclusions: 1/ based on the post-evaluation approach, we found that retrieving_model is the most influential parameter for easy queries while TrecQueryTags_process is for hard queries; 2/ for pre-evaluation, current query features do not allow queries to be clustered in a way that identifies differences in the influential parameters.

Research Article Mon, 28 Dec 2015 00:00:00 +0200
An Interactive Design Pattern Selection Method https://lib.jucs.org/article/23826/ JUCS - Journal of Universal Computer Science 21(13): 1746-1766

DOI: 10.3217/jucs-021-13-1746

Authors: Nadia Bouassida, Salma Jamoussi, Ahmed Msaed, Hanêne Ben-Abdallah

Abstract: Inexperienced designers may fail to take advantage of design patterns due to their high level of abstraction, on the one hand, and their overwhelming number, on the other hand. In this paper, we propose a new approach that first retrieves and recommends a design pattern that is adequate to a designer's modeling context and then helps them instantiate it. Our approach learns from past pattern reuse cases and interacts with the designer through a questionnaire to ensure that the retrieved pattern corresponds to their needs and intentions. It applies the text mining technique Principal Component Analysis to past experiences of design pattern reuse; the choice of this technique was based on an experimental evaluation we conducted to determine the most adequate text representation and mining technique for our problem. In a final assistance step, after retrieving the most appropriate design pattern, our approach transforms the design situation at hand into the pattern constituting the solution.

Research Article Mon, 28 Dec 2015 00:00:00 +0200
Learning to Choose the Best System Configuration in Information Retrieval: the Case of Repeated Queries https://lib.jucs.org/article/23825/ JUCS - Journal of Universal Computer Science 21(13): 1726-1745

DOI: 10.3217/jucs-021-13-1726

Authors: Anthony Bigot, Sébastien Déjean, Josiane Mothe

Abstract: This paper presents a method that automatically decides which system configuration should be used to process a query. This method is developed for the case of repeated queries and implements a new kind of meta-system. It is based on a training process: the meta-system learns the best system configuration to use on a per-query basis. After training, the meta-search system knows which configuration should treat a given query. The Learning to Choose method we developed selects the best configurations among many. This selective process rests on data analytics applied to system parameter values and their link with system effectiveness. Moreover, we optimize the parameters on a per-query basis. The training phase uses a limited number of document relevance judgments. When the query is repeated or when an equal-query is submitted to the system, the meta-system automatically knows which parameters it should use to treat the query. This method fits the case of changing collections since what is learned is the relationship between a query and the best parameters to use to process it, rather than the relationship between a query and documents to retrieve. In this paper, we describe how data analysis can help to select, among various configurations, the ones that will be useful. The Learning to Choose method is presented and evaluated using simulated data from TREC campaigns. We show that system performance increases substantially in terms of precision, specifically for queries that are difficult or moderately difficult to answer. The other parameters of the method are also studied.
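
The per-query selection idea can be sketched as a lookup table learned from training scores. This is a simplified illustration, not the authors' method: the query normalisation used to recognise an "equal-query" (lowercasing and collapsing whitespace), the configuration names, and the function names are all assumptions.

```python
from typing import Dict


def normalise(query: str) -> str:
    """Assumed notion of query equality: case- and whitespace-insensitive."""
    return " ".join(query.lower().split())


def learn_best_configurations(
    training_scores: Dict[str, Dict[str, float]]
) -> Dict[str, str]:
    """For each training query, keep the configuration with the highest score."""
    return {
        normalise(query): max(scores, key=scores.get)
        for query, scores in training_scores.items()
        if scores
    }


def choose_configuration(query: str, learned: Dict[str, str], default: str) -> str:
    """Return the learned configuration for a repeated query, else a default."""
    return learned.get(normalise(query), default)
```

Here `training_scores` maps each training query to the effectiveness (for example, Average Precision) measured for each candidate configuration. Because only the query-to-configuration mapping is stored, the table remains usable when the document collection changes.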

Research Article Mon, 28 Dec 2015 00:00:00 +0200
Cross-Language Source Code Re-Use Detection Using Latent Semantic Analysis https://lib.jucs.org/article/23824/ JUCS - Journal of Universal Computer Science 21(13): 1708-1725

DOI: 10.3217/jucs-021-13-1708

Authors: Enrique Flores, Alberto Barrón-Cedeño, Lidia Moreno, Paolo Rosso

Abstract: Nowadays, the Internet is the main source of information from blogs, encyclopedias, discussion forums, source code repositories, and other resources which are available just one click away. The temptation to re-use these materials is very high. Even source code is easily available through a simple search on the Web. There is a need to detect potential instances of source code re-use. Source code re-use detection has usually been approached by comparing source codes in their compiled version. When dealing with cross-language source code re-use, traditional approaches can deal only with the programming languages supported by the compiler. We assume that a source code is a piece of text, with its syntax and structure, so we aim at applying models for free text re-use detection to source code. In this paper we compare a Latent Semantic Analysis (LSA) approach with previously used text re-use detection models for measuring cross-language similarity in source code. The LSA-based approach shows slightly better results than the other models, being able to distinguish between re-used and related source codes with a high performance.

Research Article Mon, 28 Dec 2015 00:00:00 +0200
A Framework for Extraction of Relations from Text using Relational Learning and Similarity Measures https://lib.jucs.org/article/23659/ JUCS - Journal of Universal Computer Science 21(11): 1482-1495

DOI: 10.3217/jucs-021-11-1482

Authors: Maria Vargas-Vera

Abstract: Named entity recognition (NER) has been studied extensively in the Information Extraction community as it is one step in the construction of an Information Extraction System. However, extracting only names without contextual information is not sufficient if we want to be able to describe facts encountered in documents, in particular, academic documents. Thus, there is a need for extracting relations between entities. This task is accomplished using relational learning algorithms embedded in an Information Extraction framework. In particular, we have extended two relational learning frameworks, RAPIER and FOIL. Our proposed extended frameworks are equipped with DSSim (short for Dempster-Shafer Similarity), our similarity service. Both extended frameworks were tested using an electronic newsletter consisting of news articles describing activities or events happening in an academic institution, as our main application is in education.

Research Article Sun, 1 Nov 2015 00:00:00 +0200
Design and Implementation of an Extended Corporate CRM Database System with Big Data Analytical Functionalities https://lib.jucs.org/article/23257/ JUCS - Journal of Universal Computer Science 21(6): 757-776

DOI: 10.3217/jucs-021-06-0757

Authors: Ana Torre-Bastida, Esther Villar-Rodriguez, Sergio Gil-Lopez, Javier Ser

Abstract: The amount of open information available on-line from heterogeneous sources and domains is growing at an extremely fast pace, and constitutes an important knowledge base for the consideration of industries and companies. In this context, two relevant data providers can be highlighted: the "Linked Open Data" (LOD) and "Social Media" (SM) paradigms. The fusion of these data sources - structured the former, and raw data the latter -, along with the information contained in structured corporate databases within the organizations themselves, may unveil significant business opportunities and competitive advantage to those who are able to understand and leverage their value. In this paper, we present two complementary use cases, illustrating the potential of using the open data in the business domain. The first represents the creation of an existing and potential customer knowledge base, exploiting social and linked open data based on which any given organization might infer valuable information as a support for decision making. The second focuses on the classification of organizations and enterprises aiming at detecting potential competitors and/or allies via the analysis of the conceptual similarity between their participated projects. To this end, a solution based on the synergy of Big Data and semantic technologies will be designed and developed. The first will be used to implement the tasks of collection, data fusion and classification supported by natural language processing (NLP) techniques, whereas the latter will deal with semantic aggregation, persistence, reasoning and information retrieval, as well as with the triggering of alerts based on the semantized information.

Research Article Mon, 1 Jun 2015 00:00:00 +0300
Leveraging Hybrid Recommenders with Multifaceted Implicit Feedback https://lib.jucs.org/article/22959/ JUCS - Journal of Universal Computer Science 21(2): 223-247

DOI: 10.3217/jucs-021-02-0223

Authors: Marcelo Manzato, Edson B. Santos Junior, Rudinei Goularte

Abstract: Research into recommender systems has focused on the importance of considering a variety of users' inputs for an efficient capture of their main interests. However, most collaborative filtering efforts are related to latent factors and implicit feedback, which do not consider the metadata associated with both items and users. This article proposes a hybrid recommender model which exploits implicit feedback from users by considering not only the latent space of factors that describes the user and item, but also the available metadata associated with content and individuals. Such descriptions are an important source for the construction of a user's profile that contains relevant and meaningful information about his/her preferences. The proposed model is generic enough to be used with many descriptions and types and characterizes users and items with distinguished features that are part of the whole recommendation process. The model was evaluated with the well-known MovieLens dataset and its composing modules were compared against other approaches reported in the literature. The results show its effectiveness in terms of prediction accuracy.

Research Article Sun, 1 Feb 2015 00:00:00 +0200
A Utility-Oriented Routing Scheme for Interest-Driven Community-Based Opportunistic Networks https://lib.jucs.org/article/23821/ JUCS - Journal of Universal Computer Science 20(13): 1829-1854

DOI: 10.3217/jucs-020-13-1829

Authors: Xiuwen Fu, Wenfeng Li, Giancarlo Fortino, Pasquale Pace, Gianluca Aloi, Wilma Russo

Abstract: Opportunistic networks, as representative networks evolved from social networks and ad-hoc networks, have been at the cutting edge in recent years. Many research efforts have focused on realistic mobility models and cost-effective routing schemes. The concept of "community", as one of the most inherent attributes of opportunistic networks, has proved to be very helpful in simulating mobility traces of human society and selecting suitable message forwarders. This paper proposes an interest-driven community-based mobility model by considering location preference and time variance in human behavior patterns. Based on this enhanced mobility model, a novel two-layer routing algorithm, named InterCom, is presented by jointly considering utilities generated by users' activity degree and social relationships. The results, obtained through an intensive simulation analysis, show that the proposed routing scheme is able to improve delivery ratio while keeping the routing overhead and transmission delay within a reasonable range with respect to well-known routing schemes for opportunistic networks.

Research Article Fri, 28 Nov 2014 00:00:00 +0200
Developing Distributed Collaborative Applications with HTML5 under the Coupled Objects Paradigm https://lib.jucs.org/article/23816/ JUCS - Journal of Universal Computer Science 20(13): 1712-1737

DOI: 10.3217/jucs-020-13-1712

Authors: Nelson Baloian, Diego Aguirre, Gustavo Zurita

Abstract: One of the main tasks in developing distributed collaborative systems is to support synchronization processes. The Coupled Objects paradigm has emerged as a way to easily support these processes by dynamically coupling arbitrary user interface objects between heterogeneous applications. In this article we present an architecture for developing distributed collaborative applications using HTML5 and show its usage through the design and implementation of a series of collaborative systems in different scenarios. The experience of developing and using this architecture has shown that it is easy to use, robust and has good performance.

Research Article Fri, 28 Nov 2014 00:00:00 +0200
A Secure Multi-Layer e-Document Method for Improving e-Government Processes https://lib.jucs.org/article/23647/ JUCS - Journal of Universal Computer Science 20(11): 1583-1604

DOI: 10.3217/jucs-020-11-1583

Authors: Gia Vo, Richard Lai

Abstract: In recent years, there has been a tremendous growth in e-Government services due to advances in Information Communication Technology and the number of citizens engaging in e-Government transactions. In government administration, it is very time consuming to process different types of documents and there are many data input problems. There is also a need to satisfy citizens’ requests to retrieve government information and to link these requests to build an online document without asking the citizen to input the data more than once. To provide an e-Government service which is easy to access, fast and secure, the e-Document plays an important role in the management and interoperability of e-Government Systems. To meet these challenges, this paper presents a Secure Multilayer e-Application (SMeA) method for improving e-Government processes. This method involves five steps: namely (i) identifying an e-Template; (ii) building a SMeA; (iii) mapping the data; (iv) processing the e-Application; and (v) approving the e-Application. The first step involves requirements analysis and the last four involve data analysis for building a SMeA. To demonstrate its usefulness, we applied SMeA to a case study of an application for a licence to set up a new business in Vietnam.

Research Article Tue, 28 Oct 2014 00:00:00 +0200
A Personalized Approach for Re-ranking Search Results Using User Preferences https://lib.jucs.org/article/23484/ JUCS - Journal of Universal Computer Science 20(9): 1232-1258

DOI: 10.3217/jucs-020-09-1232

Authors: Naglaa Fathy, Tarek Gharib, Nagwa Badr, Abdulfattah Mashat, Ajith Abraham

Abstract: Web search engines provide users with a huge number of results for a submitted query. However, not all returned results are relevant to the user's needs. Personalized search aims at solving this problem by modeling the search interests of the user in a profile and exploiting it to improve the search process. One of the challenges in search personalization is how to properly model the user's search interests. Another challenge is how to effectively exploit these models to enhance the search quality. In this paper, an effective hybrid personalized re-ranking search approach is proposed by modeling the user's search interests in a conceptual user profile, and then exploiting this profile in the re-ranking process. The user profile consists of concepts obtained by hierarchically classifying the user's clicked search results into categories. These categories are extracted from the taxonomy of concepts called The Open Directory Project (ODP), where each concept represents a category. Additionally, each concept in the user profile consists of two types of documents: a taxonomy document and a viewed document. The taxonomy document is used to represent the user's general interests, as it contains information from web pages originally associated with the corresponding ODP category. The viewed document is used to represent the user's specific interests, as it contains information from web pages clicked by the user. Finally, the re-ranking of search results is performed by semantically integrating the user's general and specific interests from the user profile together with the rankings of the traditional search engine. Experimental results show that semantic identification of the user's search interests improves re-ranking quality by providing users with the most relevant results at the top of the search results list.
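
A minimal sketch of combining a concept-based user profile with the original engine ranking. The linear blend, the reciprocal-rank engine score, the `alpha` parameter, and the example concept labels are assumptions made for illustration; the abstract does not specify how the authors integrate the two signals.

```python
from typing import Dict, List, Tuple


def rerank(
    results: List[Tuple[str, str]],
    profile: Dict[str, float],
    alpha: float = 0.5,
) -> List[str]:
    """Re-rank (url, concept) results given in original engine order.

    profile maps a concept to an interest weight in [0, 1]; alpha controls
    how much the profile counts relative to the engine's own ranking.
    """
    scored = []
    for position, (url, concept) in enumerate(results, start=1):
        engine_score = 1.0 / position          # assumed reciprocal-rank score
        interest = profile.get(concept, 0.0)   # unknown concepts score zero
        scored.append((alpha * interest + (1.0 - alpha) * engine_score, url))
    # Stable sort: ties keep the original engine order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored]
```

With `alpha = 0` the original order is preserved, and raising `alpha` moves results whose concept carries a high profile weight toward the top.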

Research Article Mon, 1 Sep 2014 00:00:00 +0300
Unsupervised Structured Data Extraction from Template-generated Web Pages https://lib.jucs.org/article/22946/ JUCS - Journal of Universal Computer Science 20(2): 169-192

DOI: 10.3217/jucs-020-02-0169

Authors: Tomas Grigalis, Antanas Čenys

Abstract: This paper studies structured data extraction from template-generated Web pages. Such pages contain most of the structured data on the Web. Extracted structured data can later be integrated and reused in a very wide range of applications, such as price comparison portals, business intelligence tools, and various mashups. This encourages industry and academia to seek automatic solutions. To tackle the problem of automatic structured Web data extraction, we present a new approach: structured data extraction based on clustering visually similar Web page elements. Our method, called ClustVX, combines visual and pure HTML features of a Web page to cluster visually similar Web page elements and then extract structured Web data. ClustVX can extract structured data from Web pages where more than one data record is present. With extensive experimental evaluation on three benchmark datasets, we demonstrate that ClustVX achieves better results than other state-of-the-art automatic structured Web data extraction methods.

Research Article Sat, 1 Feb 2014 00:00:00 +0200
Development of Navigation Skills through Audio Haptic Videogaming in Learners who are Blind https://lib.jucs.org/article/23965/ JUCS - Journal of Universal Computer Science 19(18): 2677-2697

DOI: 10.3217/jucs-019-18-2677

Authors: Jaime Sánchez, Marcia Campos

Abstract: This study presents the development of a video game with audio and haptic interfaces that allows for the stimulation of orientation and mobility skills in people who are blind through the use of virtual environments. We evaluate the usability and the impact of the use of an audio and haptic-based videogame on the development of orientation and mobility skills in school-age learners who are blind. The results show that the interfaces used in the videogame are usable and appropriately designed, and that the haptic interface is as effective as the audio interface for orientation and mobility purposes.

Research Article Sun, 1 Dec 2013 00:00:00 +0200
Text Analysis for Monitoring Personal Information Leakage on Twitter https://lib.jucs.org/article/23931/ JUCS - Journal of Universal Computer Science 19(16): 2472-2485

DOI: 10.3217/jucs-019-16-2472

Authors: Dongjin Choi, Jeongin Kim, Xeufeng Piao, Pankoo Kim

Abstract: Social networking services (SNSs) such as Twitter and Facebook can be considered as new forms of media. Information spreads much faster through social media than through any form of traditional news media because people can upload information with no time and location constraints. For this reason, people have embraced SNSs and allowed them to become an integral part of their everyday lives. People express their emotional status to let others know how they feel about certain information or events. However, they are likely not only to share information with others but also to unintentionally expose personal information such as their place of residence, phone number, and date of birth. If such information is provided to users with inappropriate intentions, there may be serious consequences such as online and offline stalking. To prevent information leakages and detect spam, many researchers have monitored e-mail systems and web blogs. This paper considers text messages on Twitter, which is one of the most popular SNSs in the world, to reveal various hidden patterns by using several coefficient approaches. This paper focuses on users who exchange Tweets and examines the types of information contained in their reciprocated Tweets by monitoring samples of 50 million Tweets which were collected by Stanford University in November 2009. We chose an active Twitter user based on a "happy birthday" rule, detected their information related to place of residence and personal names by using the proposed coefficient method, and compared the results with other coefficient approaches. As a result of this research, we can conclude that the proposed coefficient method is able to detect and recommend the standard English words for non-standard words in a few conditions. Eventually, compared with using only the standard word matching method, we detected 88,882 (24.287%) more name-included Tweets and 14,054 (3.84%) location-related Tweets.
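
To make the idea of a coefficient approach concrete, the sketch below matches a non-standard token to its closest standard word with the Dice coefficient over character bigrams. The abstract does not name the coefficients the authors compare, so Dice is used here purely as a well-known example; the function names, the threshold, and the sample vocabulary are hypothetical.

```python
from typing import Iterable, Optional, Set


def bigrams(word: str) -> Set[str]:
    """Set of adjacent character pairs in the lowercased word."""
    w = word.lower()
    return {w[i:i + 2] for i in range(len(w) - 1)}


def dice(a: str, b: str) -> float:
    """Dice coefficient: 2 * |X & Y| / (|X| + |Y|) over character bigrams."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Words shorter than two characters have no bigrams.
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2.0 * len(x & y) / (len(x) + len(y))


def closest_standard_word(
    token: str, vocabulary: Iterable[str], threshold: float = 0.5
) -> Optional[str]:
    """Best-matching vocabulary word, or None if nothing reaches the threshold."""
    best = max(vocabulary, key=lambda w: dice(token, w), default=None)
    if best is None or dice(token, best) < threshold:
        return None
    return best
```

For example, `dice("night", "nacht")` is 0.25 because the two words share one bigram ("ht") out of eight in total, while an elongated token such as "birthdayyy" still maps to "birthday".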

Research Article Tue, 1 Oct 2013 00:00:00 +0300
Web Search Results Exploration via Cluster-Based Views and Zoom-Based Navigation https://lib.jucs.org/article/23893/ JUCS - Journal of Universal Computer Science 19(15): 2320-2346

DOI: 10.3217/jucs-019-15-2320

Authors: Karol Rástočný, Michal Tvarožek, Maria Bielikova

Abstract: Information seeking on the Web has become a day-to-day routine for more than two billion human beings, most of whom use traditional keyword-based search engines. Developers of these search engines stress personalization, prediction of users' next actions and mistake correction. But they are still struggling with results presentation and support for users who make atypical queries or who do not exactly know what they are looking for. We address these issues via a novel approach for exploring web repositories, which naturally combines user search activities - lookup, learning and investigation. We achieve this via view-based navigation in hierarchical clusters and two-dimensional graphs of search results.

Research Article Sun, 1 Sep 2013 00:00:00 +0300
A Generic Architecture for Emotion-based Recommender Systems in Cloud Learning Environments https://lib.jucs.org/article/23855/ JUCS - Journal of Universal Computer Science 19(14): 2075-2092

DOI: 10.3217/jucs-019-14-2075

Authors: Derick Leony, Hugo A. Parada Gélvez, Pedro Muñoz-Merino, Abelardo Pardo, Carlos Delgado-Kloos

Abstract: Cloud technology has provided a set of tools to learners and tutors to create a virtual personal learning environment. As these tools only support basic tasks, users of learning environments are looking for specialized tools to exploit the uncountable learning elements available on the internet. Thus, one of the most common functionalities in cloud-based learning environments is the recommendation of learning elements and several approaches have been proposed to deploy recommender systems into an educational environment. Currently, there is an increasing interest in including affective information into the process to generate the recommendations for the learner; and services offering this functionality on cloud environments are scarce. Hence in this paper, we propose a generic cloud-based architecture for a system that recommends learning elements according to the affective state of the learner. Furthermore, we provide the description of some use cases along with the details of the implementation of one of them. We also provide a discussion on the advantages and disadvantages of the proposal.

Research Article Thu, 1 Aug 2013 00:00:00 +0300
An Item based Geo-Recommender System Inspired by Artificial Immune Algorithms https://lib.jucs.org/article/23814/ JUCS - Journal of Universal Computer Science 19(13): 2013-2033

DOI: 10.3217/jucs-019-13-2013

Authors: Antonio Cabanas-Abascal, Eduardo García-Machicado, Lisardo Prieto-González, Antonio Seco

Abstract: Nowadays, one of the most relevant features provided by almost every web site is a recommender system. However, such systems usually focus on the common characteristics of items shared among users, without taking into account other very important features, such as geo-position. To address the lack of such relevant factors, the authors propose the use of a system that aids in tasks related to pattern detection and fast adaptability to changes: the Artificial Immune System. A combination of both systems and the addition of a geographic component will provide a new solution to this problem, which will address these issues as well as others, such as comparison tasks in big data.

Research Article Mon, 1 Jul 2013 00:00:00 +0300
Semantic Integration of Heterogeneous Data Sources in the MOMIS Data Transformation System https://lib.jucs.org/article/23813/ JUCS - Journal of Universal Computer Science 19(13): 1986-2012

DOI: 10.3217/jucs-019-13-1986

Authors: Maurizio Vincini, Domenico Beneventano, Sonia Bergamaschi

Abstract: In the last twenty years, many data integration systems following a classical wrapper/mediator architecture and providing a Global Virtual Schema (a.k.a. Global Virtual View - GVV) have been proposed by the research community. The main issues faced by these approaches range from system-level heterogeneities, through structural and syntactic heterogeneities, to heterogeneities at the semantic level. Despite the research effort, all the approaches proposed require a lot of user intervention for customizing and managing the data integration and reconciliation tasks. In some cases, the effort and the complexity of the task are huge, since it requires the development of specific program code. Unfortunately, due to the specificity to be addressed, application code and solutions are not frequently reusable in other domains. For this reason, the Lowell Report 2005 has provided the guideline for the definition of a public benchmark for the information integration problem. The proposal, called THALIA (Test Harness for the Assessment of Legacy information Integration Approaches), focuses on how data integration systems manage syntactic and semantic heterogeneities, which definitely are the greatest technical challenges in the field. We developed a Data Transformation System (DTS) that supports data transformation functions and produces query translations in order to push execution down to the sources. Our DTS is based on MOMIS, a mediator-based data integration system that our research group has been developing and supporting since 1999. In this paper, we show how the DTS is able to solve all the twelve queries of the THALIA benchmark by using a simple combination of declarative translation functions already available in the standard SQL language. We think that this is a remarkable result, mainly for two reasons: firstly, to the best of our knowledge there is no system that has provided a complete answer to the benchmark; secondly, our queries do not require any overhead of new code.

Research Article Mon, 1 Jul 2013 00:00:00 +0300
Web Resource Sense Disambiguation in Web of Data https://lib.jucs.org/article/23805/ JUCS - Journal of Universal Computer Science 19(13): 1871-1891

DOI: 10.3217/jucs-019-13-1871

Authors: Farzam Matinfar, Mohammadali Nematbakhsh, Georg Lausen

Abstract: This paper introduces the use of WordNet as a resource for sense disambiguation of RDF web resources in the Web of Data and shows the role of the designed system in interlinking datasets in the Web of Data and in the scope of word sense disambiguation. We specify the core labelling properties in the semantic web to identify the names of entities which are described in web resources and use them to identify the candidate senses for a web resource. Moreover, we define the web resource's context to identify the most appropriate sense for each of the input web resources. Evaluation of the system shows the high coverage of the core labelling properties and the high performance of the sense disambiguation algorithm.

Research Article Mon, 1 Jul 2013 00:00:00 +0300
Teaching Innova Project: the Incorporation of Adaptable Outcomes in Order to Grade Training Adaptability https://lib.jucs.org/article/23627/ JUCS - Journal of Universal Computer Science 19(11): 1500-1521

DOI: 10.3217/jucs-019-11-1500

Authors: Ángel Fidalgo, María Sein-Echaluce, Dolores Lerís, Oscar Castañeda

Abstract: The education project presented in this paper endeavors to study the feasibility of incorporating adaptive systems into LMS systems, by using them both in the training and learning process and at work. This case study is aimed at employability and job post improvement. For this purpose, we have created a process that is flexible with respect to both the student pattern and the job pattern. The developed process is adaptable both to the student (via the incorporation of an adaptable system with an LMS system) and to the job model (via an adaptable system for knowledge management). The evaluation was qualitative and measured the process (feasibility of applying adaptive systems) and the efficiency of the method (applicability and employability). The functionality of the specific developed tools allowed us to grade the degree of adaptability in the training process, to dynamically vary the training plan based on the student's actions and to identify the resources that best met the job needs.

Research Article Sat, 1 Jun 2013 00:00:00 +0300
Improving Accuracy of Decision Trees Using Clustering Techniques https://lib.jucs.org/article/23091/ JUCS - Journal of Universal Computer Science 19(4): 484-501

DOI: 10.3217/jucs-019-04-0484

Authors: Javier Torres-Niño, Alejandro Rodríguez-González, Ricardo Colomo-Palacios, Enrique Jiménez-Domingo, Giner Alor-Hernández

Abstract: Data mining is an important part of information management technology. Simply put, it is a method to extract and analyze meaningful patterns and correlations in a large relational database. In data mining, decision trees are one of the most widely used tools for decision support. In the emerging area of data mining applications, users of data mining tools are faced with the problem of data sets that comprise large numbers of features and instances. Such data sets are not easy to handle for mining because decision trees generally depend on several parameters, such as the dataset used and the configuration of the tree itself, in order to build an accurate classification model. In this work a novel hybrid classifier system is presented for improving the accuracy of decision trees using clustering techniques. This system is formed by a clustering algorithm, a decision tree and an optional module for identifying appropriate parameters for the clustering algorithm. These three modules working together are capable of increasing the accuracy of the solutions. The validation of the results of this work has been performed using several well-known datasets and applying two decision tree algorithms. The accuracy percentages are compared in order to show the improvement of our proposal, obtaining good results. Finally, two clustering algorithms have been used to compare the accuracy between different proposals.

Research Article Thu, 28 Feb 2013 00:00:00 +0200
Learning to Classify Neutral Examples from Positive and Negative Opinions https://lib.jucs.org/article/23918/ JUCS - Journal of Universal Computer Science 18(16): 2319-2333

DOI: 10.3217/jucs-018-16-2319

Authors: María-Teresa Martín-Valdivia, Arturo Montejo-Ráez, Alfonso Ureña-López, Mohammed Saleh

Abstract: Sentiment analysis is a challenging research area due to the rapid increase of subjective texts populating the web. There are several studies which focus on classifying opinions into positive or negative. Corpora are usually labeled with a star-rating scale. However, most of the studies neglect to consider neutral examples. In this paper we study the effect of using neutral sample reviews found in an opinion corpus in order to improve a sentiment polarity classification system. We have performed different experiments using several machine learning algorithms in order to demonstrate the advantage of taking the neutral examples into account. In addition we propose a model to divide neutral samples into positive and negative ones, in order to incorporate this information into the construction of the final opinion polarity classification system. Moreover, we have generated a corpus from Amazon in order to prove the convenience of the system. The results obtained are very promising and encourage us to continue researching along this line and consider neutral examples as relevant information in opinion mining tasks.

Research Article Tue, 28 Aug 2012 00:00:00 +0300
Weaving Scholarly Legacy Data into Web of Data https://lib.jucs.org/article/23916/ JUCS - Journal of Universal Computer Science 18(16): 2301-2318

DOI: 10.3217/jucs-018-16-2301

Authors: Atif Latif, Muhammad Afzal, Hermann Maurer

Abstract: The Linked Open Data project provides a new publishing paradigm for creating machine readable and structured data on the Web. Currently, the significant presence of data sets describing scholarly publications in the Linked Data cloud underpins the importance of Linked Data for the scientific community and for the open access movement. However, these semantically rich datasets need to be exploited and linked with real time applications. In the project reported here, we have exploited numerous scholarly datasets and have created semantic links to papers in an online journal, namely the Journal of Universal Computer Science (J.UCS). J.UCS plays an important part in the computer science publishing community and provides a number of innovative features and datasets to its web users. However, the legacy HTML format in which these features are made available makes it difficult for machines to understand and query. Keeping in mind the impressive benefits of the Linked Open Data project, this paper presents an approach to convert J.UCS legacy HTML data from its current form to a machine understandable format (RDF). It also interlinks this data with other important Linked Data resources. The approach developed has successfully disambiguated and interlinked J.UCS authors and publications datasets with DBpedia, DBLP, CiteULike and faceted DBLP. Additionally, triplified and interlinked datasets are made available to the scientific and semantic web community for downloading and posing SPARQL queries. This semantically linked dataset can further be used by researchers and semantic agents to identify semantic associations, to build inferencing systems, and to extract useful knowledge.

Research Article Tue, 28 Aug 2012 00:00:00 +0300
A Review of Mobile Location-based Games for Learning across Physical and Virtual Spaces https://lib.jucs.org/article/23872/ JUCS - Journal of Universal Computer Science 18(15): 2120-2142

DOI: 10.3217/jucs-018-15-2120

Authors: Nikolaos Avouris, Nikoleta Yiannoutsou

Abstract: In this paper we review mobile location-based games for learning. These games are played in physical space, but at the same time, they are supported by actions and events in an interconnected virtual space. Learning in these games is related to issues like the narrative structure, space and game rules and content that define the virtual game space. First, we introduce the theoretical and empirical considerations of mobile location based games, and then we discuss an analytical framework of their main characteristics through typical examples. In particular, we focus on their narrative structure, the interaction modes that they afford, their use of physical space as prop for action, the way this is linked to virtual space and the possible learning impact the game activities have. Finally we conclude with an outline of future trends and possibilities that these kinds of playful activities can have on learning, especially outside school, like in environmental studies and visits in museums and other sites of cultural and historical value.

Research Article Wed, 1 Aug 2012 00:00:00 +0300
The Modelling of a Digital Forensic Readiness Approach for Wireless Local Area Networks https://lib.jucs.org/article/23720/ JUCS - Journal of Universal Computer Science 18(12): 1721-1740

DOI: 10.3217/jucs-018-12-1721

Authors: Sipho Ngobeni, Hein Venter, Ivan Burke

Abstract: Over the past decade, wireless mobile communication technology based on the IEEE 802.11 Wireless Local Area Networks (WLANs) has been adopted worldwide on a massive scale. However, as the number of wireless users has soared, so has the possibility of cybercrime. WLAN digital forensics is seen as not only a response to cybercrime in wireless networks, but also a means to stem the increase of cybercrime in WLANs. The challenge in WLAN digital forensics is to intercept and preserve all the communications generated by the mobile stations and to conduct a proper digital forensic investigation. This paper attempts to address this issue by proposing a wireless digital forensic readiness model designed to monitor, log and preserve wireless network traffic for digital forensic investigations. Thus, the information needed by the digital forensic experts is rendered readily available, should it be necessary to conduct a digital forensic investigation. The availability of this digital information can maximise the chances of using it as digital evidence and it reduces the cost of conducting the entire digital forensic investigation process.

Research Article Thu, 28 Jun 2012 00:00:00 +0300
Information Security Service Culture - Information Security for End-users https://lib.jucs.org/article/23715/ JUCS - Journal of Universal Computer Science 18(12): 1628-1642

DOI: 10.3217/jucs-018-12-1628

Authors: Rahul Rastogi, Rossouw Solms

Abstract: Information security culture has been found to have a profound influence on the compliance of end-users with information security policies and controls in their organization. Similarly, a complementary aspect of information security is the culture of information security managers and developers in the organization. This paper calls this the 'information security service culture' (ISSC). ISSC shapes and guides the behaviour of information security managers and developers as they formulate information security policies and controls. Thus, ISSC has a profound influence on the nature of these policies and controls and thereby on the interaction of end-users with these artefacts. ISSC is useful in transforming information security managers and developers from their present-day technology-focused approach to an end-user centric approach.

Research Article Thu, 28 Jun 2012 00:00:00 +0300
The Wookie Widget Server: a Case Study of Piecemeal Integration of Tools and Services https://lib.jucs.org/article/23616/ JUCS - Journal of Universal Computer Science 18(11): 1432-1453

DOI: 10.3217/jucs-018-11-1432

Authors: David Griffiths, Mark Johnson, Kris Popat, Paul Sharples, Scott Wilson

Abstract: Apache Wookie (incubating) has generated considerable interest within the context of Technology Enhanced Learning where it was developed, as well as in mobile applications. The origins of the system in providing services for IMS Learning Design are described, together with an introduction to the system's design and functionality. However, the areas where it has had success are distinct from the application area for which it was designed and developed. The implications of this for understanding user needs are analysed by using ideas drawn from sociology. The complexity of the relationship between the context of use and user needs, and the feedback loops between them, is discussed, and the role of technological interventions as an element in a discourse is considered. It is proposed that this understanding of user needs, together with the experience of the development and use of Wookie, argues in favour of an interoperability strategy which focuses on relatively small sets of functional requirements, and avoidance where possible of specifications developed for particular application domains: an approach which may be characterised as piecemeal rather than Utopian.

Research Article Fri, 1 Jun 2012 00:00:00 +0300
Security-enhanced Search Engine Design in Internet of Things https://lib.jucs.org/article/23459/ JUCS - Journal of Universal Computer Science 18(9): 1218-1235

DOI: 10.3217/jucs-018-09-1218

Authors: Xiaojun Qian, Xiaoping Che

Abstract: This paper elaborates on the challenges in searching imposed by the burgeoning field of the Internet of Things (IoT). First, it reviews the evolution of the new field from its predecessors: searching in mobile computing, ubiquitous computing and information retrieval. Then, it identifies five research thrusts: architecture design, search locality, real-time, scalability and divulging information. It also sketches several presumptive IoT scenarios, and uses them to identify key capabilities missing in today's systems. On top of these challenging issues, we report on our ongoing work: a security-enhanced search engine for the Internet of Things based on an Elliptic Curve Cryptography (ECC) security protocol. We also report our preliminary experimental results.

Research Article Tue, 1 May 2012 00:00:00 +0300
Goal-Driven Process Navigation for Individualized Learning Activities in Ubiquitous Networking and IoT Environments https://lib.jucs.org/article/23454/ JUCS - Journal of Universal Computer Science 18(9): 1132-1151

DOI: 10.3217/jucs-018-09-1132

Authors: Jian Chen, Qun Jin, Runhe Huang

Abstract: In this study, we propose an integrated adaptive framework to support and facilitate individualized learning through sharing the successful process of learning activities based on similar learning patterns in ubiquitous learning environments empowered by the Internet of Things (IoT). This framework is based on a dynamic Bayesian network that gradually adapts to a target student's needs and information access behaviours. By analysing the log data of learning activities and extracting students' learning patterns, our analysis results show that most students often use their preferred learning patterns in their learning activities, and that learning achievement is affected by the learning process. Based on these findings, we try to optimise the process of learning activities using the extracted learning patterns, infer the learning goal of target students, and provide goal-driven navigation of the individualized learning process according to the similarity of the extracted learning patterns.

Research Article Tue, 1 May 2012 00:00:00 +0300
Context-based Ontology Matching: Concept and Application Cases https://lib.jucs.org/article/23452/ JUCS - Journal of Universal Computer Science 18(9): 1093-1111

DOI: 10.3217/jucs-018-09-1093

Authors: Feiyu Lin, Kurt Sandkuhl, Kevin Xu

Abstract: The Internet of Things (IoT) aims at linking smart objects that are relevant to the user and embedding intelligence into the environment. It is more and more accepted in the scientific community and expected by end users, that pervasive services should be able to adapt to the circumstances or situation in which a computing task takes place, and maybe even detect all relevant parameters for this purpose. Work presented in this paper addresses the challenge of bringing together concepts and experiences from two different areas: context modeling and ontology matching. Current work in the field of automatic ontology matching does not sufficiently take into account the context of the user during the matching process. The main contributions of this paper are (1) the introduction of the concept of "context" in the ontology matching process, (2) an approach for context-based semantic matching, which is building on different (weighted) levels of overlap for a better ranking of alignment elements depending on user's context, (3) an evaluation of the context-based matching in experiments and from user's perspectives.

Research Article Tue, 1 May 2012 00:00:00 +0300
Automatic Tag Attachment Scheme based on Text Clustering for Efficient File Search in Unstructured Peer-to-Peer File Sharing Systems https://lib.jucs.org/article/23392/ JUCS - Journal of Universal Computer Science 18(8): 1032-1047

DOI: 10.3217/jucs-018-08-1032

Authors: Ting Qin, Satoshi Fujita

Abstract: In this paper, the authors address the issue of automatic tag attachment to documents distributed over a P2P network, aiming at improving the efficiency of file search in such networks. The proposed scheme combines text clustering with a modified tag extraction algorithm, and is executed in a fully distributed manner. In addition, the optimal cluster number can be fixed automatically through a distance cost function. We have conducted experiments to evaluate the accuracy of the proposed scheme. The results indicate that the proposed approach is capable of making effective and efficient tag attachment in real scenarios; i.e., for more than 90% of documents, it attaches the same tags as the ones attached by human reviewers. Moreover, the experiments show that the optimal cluster number is almost the same as the number of topics from the website.

Research Article Sat, 28 Apr 2012 00:00:00 +0300
Establishing Knowledge Networks via Analysis of Research Abstracts https://lib.jucs.org/article/23388/ JUCS - Journal of Universal Computer Science 18(8): 993-1021

DOI: 10.3217/jucs-018-08-0993

Authors: Mahalakshmi Suryanarayanan, Dilip Sam, Sendhilkumar Selvaraju

Abstract: The extraction and propagation of knowledge inherent in a social network environment is gaining significance in research. The knowledge hidden within a social network would be easier to comprehend if provided in a collective form. In the field of scientific research, such a presentation of the knowledge that has evolved from research communities would aid researchers. In this paper, we propose the evolution of a knowledge network from the information available in digital bibliographic repositories like DBLP [DBLP]. The most important characteristic of this knowledge network would be the comprehension of the proficiency of the scientist in the perspective of an area of research. This is achieved by categorizing the research articles published by an author into specific domains. The quality of the research articles is ascertained by analysing the abstracts within the domain. This analysis is used to determine the quality of the research article in terms of originality and relevancy, and thereby the impact of the article with respect to a research area. From this quality measure, the impact of the scientist on the research community is arrived at as a cumulative entity. This knowledge helps in the evolution of the knowledge network from the social network of a research community.

Research Article Sat, 28 Apr 2012 00:00:00 +0300
Wikipedia-Based Semantic Interpreter Using Approximate Top-k Processing and Its Application https://lib.jucs.org/article/23166/ JUCS - Journal of Universal Computer Science 18(5): 650-675

DOI: 10.3217/jucs-018-05-0650

Authors: Jong Kim, Ashwin Kashyap, Sandilya Bhamidipati

Abstract: Proper representation of the meaning of texts is crucial for enhancing many data mining and information retrieval tasks, including clustering, computing semantic relatedness between texts, and searching. Representing texts in the concept-space derived from Wikipedia has received growing attention recently. This concept-based representation is capable of extracting semantic relatedness between texts that cannot be deduced with the bag of words model. A key obstacle, however, to using Wikipedia as a semantic interpreter is that the sheer number of concepts derived from Wikipedia makes it hard to efficiently map texts into concept-space. In this paper, we develop an efficient and effective algorithm which is able to represent the meaning of a text by using the concepts that best match it. In particular, our approach first computes the approximate top-k Wikipedia concepts that are most relevant to the given text. We then leverage these concepts for representing the meaning of the given text. The experimental results show that the proposed technique provides significant gains in execution time without causing significant reduction in precision. We then explore the effectiveness of the proposed algorithm on a real world problem. In particular, we show that this novel scheme could be leveraged to boost the effectiveness in finding topic boundaries in a news video.
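
As a rough illustration of mapping a text to its best-matching concepts, the sketch below scores concepts through a term-to-concept inverted index and keeps the k highest. It computes an exact top-k, not the approximate top-k algorithm the abstract refers to, and the index contents, weights and the function name `top_k_concepts` are hypothetical.

```python
import heapq
from collections import Counter, defaultdict


def top_k_concepts(text_tokens, inverted_index, k=2):
    """Return the k (concept, score) pairs with the highest scores.

    inverted_index maps a term to {concept: weight}; the weights are
    illustrative stand-ins for TF-IDF-style term/concept associations.
    """
    scores = defaultdict(float)
    # Accumulate, per concept, term frequency in the text times term weight.
    for term, tf in Counter(text_tokens).items():
        for concept, weight in inverted_index.get(term, {}).items():
            scores[concept] += tf * weight
    # Exact selection of the k best-scoring concepts.
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical toy index (not real Wikipedia data).
index = {
    "goal": {"Football": 0.9, "Planning": 0.3},
    "match": {"Football": 0.7, "String matching": 0.8},
    "striker": {"Football": 1.0},
}
top = top_k_concepts(["goal", "match", "striker", "goal"], index, k=2)
print([concept for concept, _ in top])
```

An approximate variant would stop scanning the posting lists early once the remaining terms could no longer change the top k, trading a small loss of precision for speed.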

Research Article Thu, 1 Mar 2012 00:00:00 +0200
RESTifying a Legacy Semantic Search System: Experience and Lessons Learned https://lib.jucs.org/article/22932/ JUCS - Journal of Universal Computer Science 18(2): 286-311

DOI: 10.3217/jucs-018-02-0286

Authors: Guillermo Vega-Gorgojo, Eduardo Gómez-Sánchez, Miguel Bote-Lorenzo, Juan Asensio-Pérez

Abstract: The REST architectural style pursues scalability and decoupling of application components on target architectures, as opposed to the focus on distribution transparency of RPC-based middleware infrastructures. Ongoing debate between REST and RPC proponents shows the need for comparisons of both approaches, as well as case studies showing the implications for the development of RESTful applications. With this aim, this paper presents a revamped RESTful version of a legacy RPC-based search system of educational tools named Ontoolsearch. The former version suffers from reduced interoperability with third-party clients and limited visibility of interactions, and has some scalability issues due to the use of an RPC-based middleware. These limitations are addressed in the RESTful application as a result of applying REST constraints and using the Atom data format. Further, a benchmarking experiment showed that the scalability of the RESTful prototype is superior, measuring a roughly threefold increase in peak throughput. In addition, some lessons learned on RESTful design and implementation have been derived from this work that may be of interest for future developments.

Research Article Sat, 28 Jan 2012 00:00:00 +0200
An Intelligent System for Automated Binary Knowledge Document Classification and Content Analysis https://lib.jucs.org/article/30035/ JUCS - Journal of Universal Computer Science 17(14): 1991-2008

DOI: 10.3217/jucs-017-14-1991

Authors: Tzu-An Chiang, Chun-Yi Wu, Charles Trappey, Amy J. C. Trappey

Abstract: Many companies rely on patent engineers to search patent documents and offer recommendations and advice to R&D engineers. Given the increasing number of patent documents filed each year, new means to effectively and efficiently identify and manage technology-specific patent documents are required. This research applies a back-propagation artificial neural network (BPANN), a hierarchical ontology technique, and a normalized term frequency (NTF) method to develop an intelligent system for binary knowledge document classification and content analysis. The intelligent system minimizes inappropriate patent document classification and reduces the effort required to search and screen patents for analysis. Finally, this paper uses the design of light emitting diode (LED) lamps as a case study to illustrate and verify the efficiency of automated binary knowledge document classification and content analysis.
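
To make the term-weighting idea concrete, here is a minimal sketch of one common way to normalize term frequency: dividing each term's count by the count of the most frequent term in the same document. The abstract does not give the paper's exact NTF formula, so this definition, the function name and the sample tokens are assumptions.

```python
from collections import Counter


def normalized_term_frequency(tokens):
    """Return {term: count / max_count} for a single document.

    Assumed definition: each raw count is divided by the count of the
    most frequent term, so every value lies in (0, 1].
    """
    counts = Counter(tokens)
    if not counts:
        return {}
    max_count = max(counts.values())
    return {term: count / max_count for term, count in counts.items()}


# Hypothetical token list standing in for a patent document.
doc = "led lamp heat sink led driver led".split()
ntf = normalized_term_frequency(doc)
print(ntf["led"], ntf["lamp"])
```

Vectors built this way could serve as input features for a classifier such as a neural network, since normalization keeps long and short documents on a comparable scale.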

Research Article Sat, 1 Oct 2011 00:00:00 +0300
An Inquiry into the Utilization of Behavior of Users in Personalized Web https://lib.jucs.org/article/30023/ JUCS - Journal of Universal Computer Science 17(13): 1830-1853

DOI: 10.3217/jucs-017-13-1830

Authors: Michal Holub, Mária Bieliková

Abstract: Nowadays we see a gradual transformation of the Web into its personalized form. In order to personalize the content to suit each user's requirements we need to acquire the user's interests. Utilization of implicit feedback is the most suitable and unobtrusive way of doing so. In this paper we present various forms of implicit feedback and their application in the estimation of a user's interests. We propose a method of link recommendation based on the recorded actions users take while visiting a website. We employ collaborative filtering to predict user interest in unvisited pages. We present an evaluation of our method using the web portal of our faculty, where personalized recommendation of links to interesting events is provided for visitors.
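
The idea of predicting interest in unvisited pages from other visitors' behaviour can be sketched with a small user-based collaborative filter. This is a generic illustration using cosine similarity, not the authors' method; the feedback weights (imagined as scores derived from recorded actions such as time on page), the page paths and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse {page: weight} profiles."""
    shared = set(u) & set(v)
    dot = sum(u[p] * v[p] for p in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, feedback, top_n=3):
    """Rank pages the target user has not visited by similarity-weighted interest."""
    scores = {}
    for other, pages in feedback.items():
        if other == target:
            continue
        sim = cosine(feedback[target], pages)
        if sim <= 0:
            continue  # ignore users with no overlapping behaviour
        for page, weight in pages.items():
            if page not in feedback[target]:
                scores[page] = scores.get(page, 0.0) + sim * weight
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


# Hypothetical implicit-feedback profiles.
feedback = {
    "ann": {"/events/ai": 3.0, "/events/db": 1.0},
    "bob": {"/events/ai": 2.0, "/events/web": 4.0},
    "eve": {"/events/hci": 5.0},
}
recs = recommend("ann", feedback)
print(recs)
```

Here only bob shares a visited page with ann, so only bob's unvisited page is recommended; eve's profile has zero similarity and contributes nothing.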

Research Article Thu, 1 Sep 2011 00:00:00 +0300
Recommending Open Linked Data in Creativity Sessions using Web Portals with Collaborative Real Time Environment https://lib.jucs.org/article/30016/ JUCS - Journal of Universal Computer Science 17(12): 1690-1709

DOI: 10.3217/jucs-017-12-1690

Authors: Peter Dolog, Frederico Durao, Karsten Jahn, Yujian Lin, Dennis Peitersen

Abstract: In this paper we describe a concept of a recommender system for collaborative real time web based editing in the context of creativity sessions. Collaborative real time editing provides creativity teams whose members are physically distributed with an emulation of synchronous collaboration where the presence of the team members is required simultaneously (e.g., brainstorming, meetings). The concept of recommendation is based on matching the activities currently performed at the user interface with external linked open data provided through SPARQL endpoints. The real time propagation of changes in the editor and in the recommendations is achieved by reverse AJAX and the observer pattern. An experiment in the creativity domain shows that recommendations in collaborative real time editing activities are useful for task performance, guidance, and inspiration.

Research Article Mon, 1 Aug 2011 00:00:00 +0300
Expertise Recommender System for Scientific Community https://lib.jucs.org/article/30006/ JUCS - Journal of Universal Computer Science 17(11): 1529-1549

DOI: 10.3217/jucs-017-11-1529

Authors: Muhammad Afzal, Hermann Maurer

Abstract: Finding experts in academia as well as in enterprises is an important practical problem. Both manual and automated approaches are employed and have their own pros and cons. On the one hand, manual approaches need extensive human effort but the quality of data is good; on the other hand, automated approaches normally do not need human effort but the quality of service is not as good as in the manual approaches. Furthermore, automated approaches normally use only one metric to measure the expertise of an individual. For example, for finding experts in academia, the number of publications of an individual is used to discover and rank experts. This paper illustrates both manual and automated approaches for finding experts and subsequently proposes and implements an automated approach for measuring expertise profiles in academia. The proposed approach incorporates multiple metrics for measuring an overall expertise level. To visualize a ranked list of experts, an extended hyperbolic visualization technique is proposed and implemented. Furthermore, the discovered experts are pushed to users based on their local context. The research has been implemented for the Journal of Universal Computer Science (J.UCS) and is available online for the users of J.UCS.

Research Article Fri, 1 Jul 2011 00:00:00 +0300
An Ontology based Agent Generation for Information Retrieval on Cloud Environment https://lib.jucs.org/article/29972/ JUCS - Journal of Universal Computer Science 17(8): 1135-1160

DOI: 10.3217/jucs-017-08-1135

Authors: Yue-Shan Chang, Chao-Tung Yang, Yu-Cheng Luo

Abstract: Retrieving information or discovering knowledge from a well-organized data center generally requires familiarity with its schema, structure, and architecture, which goes against the inherent concept and characteristics of a cloud environment. An effective approach to retrieving desired information or extracting useful knowledge is an important issue in the emerging information/knowledge cloud. In this paper, we propose an ontology-based agent generation framework for information retrieval in a flexible, transparent, and easy way in a cloud environment. When a user submits a flat-text request for retrieving information in a cloud environment, the request is automatically deduced by a Reasoning Agent (RA) based on a predefined ontology and reasoning rules, and is then translated into a Mobile Information Retrieving Agent Description File (MIRADF) that is formatted in a proposed Mobile Agent Description Language (MADF). A generating agent, named MIRA-GA, is also implemented to generate a MIRA according to the MIRADF. We also design and implement a prototype to integrate these agents and show an interesting example to demonstrate the feasibility of the architecture.

Research Article Thu, 28 Apr 2011 00:00:00 +0300
A Comparison of Different Retrieval Strategies Working on Medical Free Texts https://lib.jucs.org/article/29970/ JUCS - Journal of Universal Computer Science 17(7): 1109-1133

DOI: 10.3217/jucs-017-07-1109

Authors: Markus Kreuzthaler, Marcus Bloice, Lukas Faulstich, Klaus-Martin Simonic, Andreas Holzinger

Abstract: Patient information in health care systems mostly consists of textual data, and free text in particular makes up a significant amount of it. Information retrieval systems that concentrate on these text types have to deal with the different challenges these medical free texts pose in order to achieve acceptable performance. This paper describes the evaluation of four different types of information retrieval strategies: keyword search, search performed by a medical domain expert, a semantic-based information retrieval tool, and a purely statistical information retrieval method. The different methods are evaluated and compared with respect to their application in medical health care systems.

Research Article Fri, 1 Apr 2011 00:00:00 +0300
Towards Classification of Web Ontologies for the Emerging Semantic Web https://lib.jucs.org/article/29959/ JUCS - Journal of Universal Computer Science 17(7): 1021-1042

DOI: 10.3217/jucs-017-07-1021

Authors: Muhammad Fahad, Nejib Moalla, Abdelaziz Bouras, Muhammad Qadir, Muhammad Farukh

Abstract: The massive growth in ontology development has opened new research challenges, such as ontology management, search and retrieval, for the entire semantic web community. This has resulted in many recent developments, like OntoKhoj, Swoogle, and OntoSearch2, that facilitate the tasks users have to perform. These semantic web portals mainly treat ontologies as plain texts and use traditional text classification algorithms for classifying ontologies in directories and assigning predefined labels, rather than using the semantic knowledge hidden within the ontologies. These approaches suffer from many types of classification problems and a lack of accuracy, especially in the case of overlapping ontologies that share common vocabularies. In this paper, we define an ontology classification problem and categorize it into many sub-problems. We present a new ontological methodology for the classification of web ontologies, which has been guided by the requirements of emerging Semantic Web applications and by the lessons learnt from previous systems. The proposed framework, OntClassifire, is tested on 34 ontologies with a certain degree of domain overlap, and the effectiveness of the ontological mechanism is verified. It benefits the construction, maintenance or expansion of ontology directories on the semantic web, which help to focus crawling and improve the quality of search for software agents and people. We conclude that the use of context-specific knowledge hidden in the structure of ontologies gives more accurate results for ontology classification.

Research Article Fri, 1 Apr 2011 00:00:00 +0300
A Framework to Evaluate Interface Suitability for a Given Scenario of Textual Information Retrieval https://lib.jucs.org/article/29938/ JUCS - Journal of Universal Computer Science 17(6): 831-858

DOI: 10.3217/jucs-017-06-0831

Authors: Nicolas Bonnel, Max Chevalier, Claude Chrisment, Gilles Hubert

Abstract: Visualization of search results is an essential step in the textual Information Retrieval (IR) process. Indeed, Information Retrieval Interfaces (IRIs) are used as a link between users and IR systems, a simple example being the ranked list proposed by common search engines. Due to the importance of visualizing search results, many interfaces have been proposed in the last decade (which can be textual, 2D or 3D IRIs). Two kinds of evaluation methods have been developed: (1) various evaluation methods for these interfaces were proposed, aiming at validating ergonomic and cognitive aspects; (2) various evaluation methods were applied to information retrieval systems (IRS), aiming at measuring their effectiveness. However, as far as we know, these two kinds of evaluation methods are disjoint. Indeed, considering a given IRI associated with a given IRS, what happens if we associate this IRI with another IRS that does not have the same effectiveness? In this context, we propose an IRI evaluation framework aimed at evaluating the suitability of any IRI for different IR scenarios. First of all, we define the notion of an IR scenario as a combination of features related to users, IR tasks and IR systems. We have implemented the framework through a specific evaluation platform that enables performing IRI evaluations and that helps end-users (e.g. IRS developers or IRI designers) in choosing the most suitable IRI for a specific IR scenario.

Research Article Mon, 28 Mar 2011 00:00:00 +0300
A Clustering Approach for Collaborative Filtering Recommendation Using Social Network Analysis https://lib.jucs.org/article/29919/ JUCS - Journal of Universal Computer Science 17(4): 583-604

DOI: 10.3217/jucs-017-04-0583

Authors: Manh Pham, Yiwei Cao, Ralf Klamma, Matthias Jarke

Abstract: Collaborative Filtering (CF) is a well-known technique in recommender systems. CF exploits relationships between users and recommends items to the active user according to the ratings of his/her neighbors. CF suffers from the data sparsity problem, where users only rate a small set of items. That makes the computation of similarity between users imprecise and consequently reduces the accuracy of CF algorithms. In this article, we propose a clustering approach based on the social information of users to derive the recommendations. We study the application of this approach in two application scenarios: academic venue recommendation based on collaboration information, and trust-based recommendation. Using data from the DBLP digital library and Epinion, the evaluation shows that our clustering-based CF technique performs better than traditional CF algorithms.
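To make the neighbor-based idea and the sparsity problem concrete, here is a minimal sketch of plain user-based CF. It is not the clustering approach proposed in the article; the cosine similarity over co-rated items, the function names and the toy `ratings` data are all assumptions chosen for illustration.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity restricted to the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0  # sparsity: no co-rated items, so no evidence of similarity
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(ratings, user, item):
    """Similarity-weighted average of the neighbors' ratings for `item`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = cosine_similarity(ratings[user], their)
        num += sim * their[item]
        den += abs(sim)
    return num / den if den else None  # None: no usable neighbor

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "ann": {"a": 5, "b": 3},
    "bob": {"a": 5, "b": 3, "c": 4},
    "cat": {"d": 2},
}
```

In this toy data, `predict(ratings, "ann", "c")` can draw on bob, who shares two rated items with ann, whereas `predict(ratings, "ann", "d")` returns `None` because the only user who rated `d` has no item in common with ann. That second case is the kind of gap that additional social information about users is meant to help with.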

Research Article Mon, 28 Feb 2011 00:00:00 +0200
Nabuco - Two Decades of Document Processing in Latin America https://lib.jucs.org/article/29884/ JUCS - Journal of Universal Computer Science 17(1): 151-161

DOI: 10.3217/jucs-017-01-0151

Authors: Rafael Lins

Abstract: This paper reports on the Joaquim Nabuco Project, a pioneering work in Latin America on document digitalization, enhancement, compression, indexing, retrieval and network transmission of historical document images.

Research Article Sat, 1 Jan 2011 00:00:00 +0200
The Use of Latent Semantic Indexing to Mitigate OCR Effects of Related Document Images https://lib.jucs.org/article/29879/ JUCS - Journal of Universal Computer Science 17(1): 64-80

DOI: 10.3217/jucs-017-01-0064

Authors: Renato Bulcão-Neto, José Camacho-Guerrero, Marcio Dutra, Álvaro Barreiro, Javier Parapar, Alessandra Macedo

Abstract: Due to both the widespread and multipurpose use of document images and the current availability of a large number of document image repositories, robust information retrieval mechanisms and systems are increasingly in demand. This paper presents an approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR). We developed the LinkDI (Linking of Document Images) service, which extracts and indexes document image content, computes its latent semantics, and defines relationships among images as hyperlinks. LinkDI was tested on document image repositories, and its performance was evaluated by comparing the quality of the relationships created among textual documents as well as among their respective document images. Considering those same document images, we ran further experiments in order to compare the performance of LinkDI with and without the LSI technique. Experimental results showed that LSI can mitigate the effects of usual OCR misrecognition, which reinforces the feasibility of LinkDI relating OCR output with high degradation.

Research Article Sat, 1 Jan 2011 00:00:00 +0200
Extending the Methods for Computing the Importance of Entity Types in Large Conceptual Schemas https://lib.jucs.org/article/29851/ JUCS - Journal of Universal Computer Science 16(20): 3138-3162

DOI: 10.3217/jucs-016-20-3138

Authors: Antonio Villegas, Antoni Olivé

Abstract: Visualizing and understanding large conceptual schemas requires the use of specific methods. These methods generate clustered, summarized, or focused schemas that are easier to visualize and understand. All of these methods require computing the importance of each entity type in the schema. In principle, the totality of knowledge defined in the schema could be relevant for the computation of that importance but, up to now, only a small part of that knowledge has been taken into account. In this paper, we extend seven existing methods for computing the importance of entity types by taking into account more relevant knowledge defined in the structural and behavioural parts of the schema. We experimentally evaluate the original and extended versions of these methods with three large real-world schemas. We present the two main conclusions we have drawn from the experiments.

Research Article Mon, 1 Nov 2010 00:00:00 +0200
Usage-based Object Similarity https://lib.jucs.org/article/29768/ JUCS - Journal of Universal Computer Science 16(16): 2272-2290

DOI: 10.3217/jucs-016-16-2272

Authors: Katja Niemann, Maren Scheffel, Martin Friedrich, Uwe Kirschenmann, Hans-Christian Schmitz, Martin Wolpers

Abstract: Recommender systems are widely used online to support users in finding relevant information. They can be based on different techniques such as content-based and collaborative filtering. In this paper, we introduce a new way of calculating similarity for item-based collaborative filtering. We focus on the usage of an object and not on the object's users, as we hypothesize that similarity of usage indicates content similarity. To test this hypothesis, we use learning objects accessible through the MACE portal, where students can query several architectural repositories. For these objects, we generate object profiles based on their usage monitored within MACE. We further propose several recommendation techniques to apply this usage-based similarity calculation in real systems.

Research Article Sat, 28 Aug 2010 00:00:00 +0300
Real-time Analysis of Time-based Usability and Accessibility for Human Mobile-Web Interactions in the Ubiquitous Internet https://lib.jucs.org/article/29745/ JUCS - Journal of Universal Computer Science 16(15): 1953-1972

DOI: 10.3217/jucs-016-15-1953

Authors: Yung Kim

Abstract: In the ubiquitous Internet, human mobile-web interactions can be evaluated with real-time analysis of time-based usability and accessibility with the different types of mobile Internet devices including smart phones (e.g. iPhone, Android phone, etc.). A ubiquitous mobile-web interaction server, accessible with a variety of mobile Internet devices, could be a unified estimation hub in real-time analysis of human-centric mobile-web interactions. We propose the real-time analysis scheme based on real-time estimation of time-based usability and accessibility for human mobile-web interactions with a name-based directory server for social networking in the ubiquitous Internet environment. We present an implementation of a ubiquitous mobile-web directory service and discuss our approach with some empirical results.

Research Article Sun, 1 Aug 2010 00:00:00 +0300
Trust-Oriented Composite Service Selection with QoS Constraints https://lib.jucs.org/article/29727/ JUCS - Journal of Universal Computer Science 16(13): 1720-1744

DOI: 10.3217/jucs-016-13-1720

Authors: Lei Li, Yan Wang, Ee-Peng Lim

Abstract: In Service-Oriented Computing (SOC) environments, service clients interact with service providers for consuming services. From the viewpoint of service clients, the trust level of a service or a service provider is a critical factor to consider in service selection, particularly when a client is looking for a service from a large set of services or service providers. However, an invoked service may be composed of other services. The complex invocations in composite services greatly increase the complexity of trust-oriented service selection. In this paper, we propose novel approaches for composite service representation, trust evaluation and trust-oriented composite service selection (with QoS constraints). Our experimental results illustrate that, compared with the existing approaches, our proposed trust-oriented (QoS constrained) composite service selection algorithms are realistic and enjoy better efficiency.

Research Article Thu, 1 Jul 2010 00:00:00 +0300
Classification of Software for the Simulation of Light Scattering and Realization within an Internet Information Portal https://lib.jucs.org/article/29677/ JUCS - Journal of Universal Computer Science 16(9): 1176-1189

DOI: 10.3217/jucs-016-09-1176

Authors: Jens Hellmers, Thomas Wriedt

Abstract: Light scattering studies are done by researchers of various scientific areas. As the calculation of the scattering behavior by small particles is rather complex, corresponding programs usually can be used for specific problems only and therefore a multitude of programs have been developed over the years. To enable researchers to find the best fitting one for their scattering problem a categorization scheme for such software is presented here. This scheme is used within an actual project to set up a new internet information portal on the topic of light scattering. The approach for the integration of the scheme as well as the implementation of a corresponding search tool is described in this article.

Research Article Sat, 1 May 2010 00:00:00 +0300
Evaluating Linear XPath Expressions by Pattern-Matching Automata https://lib.jucs.org/article/29643/ JUCS - Journal of Universal Computer Science 16(5): 833-851

DOI: 10.3217/jucs-016-05-0833

Authors: Panu Silvasti, Seppo Sippu, Eljas Soisalon-Soininen

Abstract: We consider the problem of efficiently evaluating a large number of XPath expressions, especially in the case when they define subscriber profiles for filtering of XML documents. For each document in an XML document stream, the task is to determine those profiles that match the document. In this article we present a new general method for filtering with profiles expressed by linear XPath expressions with child operators (/), descendant operators (//), and wildcards (*). This new filtering algorithm is based on a backtracking deterministic finite automaton derived from the classic Aho-Corasick pattern-matching automaton. This automaton has a size linear in the sum of the sizes of the XPath filters, and the worst-case time bound of the algorithm is much less than the time bound of the simulation of linear-size nondeterministic automata. Our new algorithm has a predecessor that can handle child and descendant operators but not wildcards, and has been shown to be extremely efficient when a document type definition (DTD) has been used to prune out all the wildcards and most of the descendant operators. But in some cases, such as when the DTD is highly recursive, it may not be possible to prune out all wildcards without producing too large a set of filters. Then it is important to have the full generality of an evaluation algorithm, as presented in this article, that can also handle wildcards.
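As a rough illustration of what matching a linear XPath expression with `/`, `//` and `*` involves, the sketch below checks one expression against one root-to-node path of element names by tracking a set of reachable positions. This is a naive state-set simulation of the kind the abstract uses as its point of comparison, not the backtracking Aho-Corasick-based automaton of the article; the function names and the convention that the last step must match the last element of the path are assumptions.

```python
import re

def parse(xpath):
    """Split a linear XPath into (axis, name) steps; axis is '/' or '//'."""
    steps = re.findall(r"(//|/)([^/]+)", xpath)
    if "".join(axis + name for axis, name in steps) != xpath:
        raise ValueError(f"not a linear XPath: {xpath!r}")
    return steps

def matches(xpath, path):
    """True if the steps align with `path`, a root-to-node list of names."""
    # reachable holds how many path elements may have been consumed so far
    reachable = {0}
    for axis, name in parse(xpath):
        nxt = set()
        for pos in reachable:
            # '/' must match the very next element; '//' may skip ahead
            candidates = [pos] if axis == "/" else range(pos, len(path))
            for i in candidates:
                if i < len(path) and name in ("*", path[i]):
                    nxt.add(i + 1)
        reachable = nxt
    # the final step has to land on the last element of the path
    return len(path) in reachable
```

For example, `matches("/a//b/*", ["a", "x", "b", "c"])` holds, while `matches("/a/b", ["a", "x", "b"])` does not, because the child operator cannot skip over `x`. Running this per filter costs time proportional to the number of filters, which is the overhead a shared automaton over all filters is designed to avoid.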

Research Article Mon, 1 Mar 2010 00:00:00 +0200
SOM Clustering to Promote Interoperability of Directory Metadata: A Grid-Enabled Genetic Algorithm Approach https://lib.jucs.org/article/29640/ JUCS - Journal of Universal Computer Science 16(5): 800-820

DOI: 10.3217/jucs-016-05-0800

Authors: Lei Li, Vijay Vaishnavi, Art Vandenberg

Abstract: Directories provide a general mechanism for describing resources and enabling information sharing within and across organizations. Directories must resolve differing structures and vocabularies in order to communicate effectively, and interoperability of the directories is becoming increasingly important. This study proposes an approach that integrates a genetic algorithm with a neural network based clustering algorithm - Self-Organizing Maps (SOM) - to systematically cluster directory metadata, highlight similar structures, recognize developing patterns of practice, and potentially promote homogeneity among the directories. The proposed approach utilizes the computing power of Grid infrastructure to improve system performance. The study also explores the feasibility of automating the SOM clustering process in a converging domain by incrementally building a stable SOM map with respect to an initial reference set. Empirical investigations were conducted on sets of Lightweight Directory Access Protocol (LDAP) directory metadata. The experimental results show that the proposed approach can effectively and efficiently cluster LDAP directory metadata at the level of domain experts and a stable SOM map can be created for a set of converging LDAP directory metadata.

Research Article Mon, 1 Mar 2010 00:00:00 +0200
Block-based Against Segmentation-based Texture Image Retrieval https://lib.jucs.org/article/29606/ JUCS - Journal of Universal Computer Science 16(3): 402-423

DOI: 10.3217/jucs-016-03-0402

Authors: Mohammad Faizal Ahmad Fauzi, Paul Lewis

Abstract: This paper concerns the best approach to the capture of local texture features for use in content-based image retrieval (CBIR) applications. From our previous work, two approaches have been suggested, the multiscale block-based approach and the automatic texture segmentation approach. Performance comparison as well as advantages and disadvantages of the two methods are presented in this paper. The databases used are the Brodatz and VisTex databases, as well as three museum image collections of various sizes and contents, with each collection presenting different challenges to the CBIR systems. Experimental observations suggest that the two approaches both perform well, with the multiscale technique having the edge in retrieval performance and scale invariance, while the segmentation technique has the edge in lighter computational complexity as well as having the shape information for later purposes. The choice between the two approaches thus depends on application.

Research Article Mon, 1 Feb 2010 00:00:00 +0200
An Approach to Generation of Decision Rules https://lib.jucs.org/article/29579/ JUCS - Journal of Universal Computer Science 16(1): 140-158

DOI: 10.3217/jucs-016-01-0140

Authors: Zhang Mingyi, Li Danning, Zhang Ying

Abstract: Classical classification and clustering based on equivalence relations are very important tools in decision-making. An equivalence relation is usually determined by properties of objects in a given domain. When making decisions, anything that can be spoken about in the subject position of a natural sentence is an object, the properties of which are fundamental elements of the knowledge of the given domain. This gives the possibility of representing the concept related to a given domain. In general, the information about a set of objects is uncertain or incomplete. Various approaches to representing the uncertainty of a concept have been proposed. In particular, Zadeh's fuzzy set theory and Pawlak's rough set theory have been the most influential in this research field. Zadeh characterizes the uncertainty of a concept by introducing a membership function and a similarity (fuzzy equivalence) relation on a set of objects. Pawlak then characterizes the uncertainty of a concept by the union of some equivalence classes of an equivalence relation. As a particularly important and widely used binary relation, the equivalence relation plays a fundamental role in classification, clustering, pattern recognition, polling, automata, learning, control inference, natural language understanding, etc. An equivalence relation is a binary relation with reflexivity, symmetry and transitivity. However, in many real situations, it is not sufficient to consider equivalence relations only. In fact, many relations determined by the attributes of objects do not satisfy transitivity. In particular, information obtained from a domain of objects is not transitive when we make decisions based on properties of objects. Moreover, the information about the symmetry of a relation is mostly uncertain. So, it is necessary to make decisions and reason approximately using indistinct concepts. This prompts us to explore a new class of relations, the so-called class of fuzzy semi-equivalence relations.
In this paper we introduce the notion of fuzzy semi-equivalence relations and study its properties. In particular, a constructive method for fuzzy semi-equivalence classes is presented. Applying it, we present approaches to the fuzzification of indistinct concepts approximated by fuzzy relative and semi-equivalence classes, respectively. An application of the fuzzy semi-equivalence relation theory to generating decision rules is also outlined.
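The claim that attribute-based relations often fail transitivity can be checked on a small crisp example. The sketch below covers only classical (non-fuzzy) relations given as sets of pairs; it does not implement the fuzzy semi-equivalence relations of the paper, whose definition is not given in the abstract. The `close` and `same_parity` relations are hypothetical examples.

```python
def is_reflexive(domain, rel):
    """Every element is related to itself."""
    return all((x, x) in rel for x in domain)

def is_symmetric(rel):
    """Whenever x is related to y, y is related to x."""
    return all((y, x) in rel for x, y in rel)

def is_transitive(rel):
    """Whenever x~y and y~w hold, x~w holds as well."""
    return all((x, w) in rel for x, y in rel for z, w in rel if y == z)

def is_equivalence(domain, rel):
    return is_reflexive(domain, rel) and is_symmetric(rel) and is_transitive(rel)

domain = [1, 2, 3]
# Hypothetical "close enough" relation: values differ by at most 1.
close = {(x, y) for x in domain for y in domain if abs(x - y) <= 1}
# Hypothetical relation that is a genuine equivalence: same parity.
same_parity = {(x, y) for x in domain for y in domain if x % 2 == y % 2}
```

Here `close` is reflexive and symmetric but not transitive (1 is close to 2 and 2 is close to 3, yet 1 is not close to 3), so it induces no partition of the domain, whereas `same_parity` passes all three checks.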

Research Article Fri, 1 Jan 2010 00:00:00 +0200
Automatically Deciding if a Document was Scanned or Photographed https://lib.jucs.org/article/29564/ JUCS - Journal of Universal Computer Science 15(18): 3364-3375

DOI: 10.3217/jucs-015-18-3364

Authors: Gabriel Pereira e Silva, Rafael Lins, Brenno Miro, Steven Simske, Marcelo Thielo

Abstract: Portable digital cameras are being used widely by students and professionals in different fields as a practical way to digitize documents. Tools such as PhotoDoc enable the batch processing of such documents, performing automatic border removal and perspective correction. A PhotoDoc processed document and a scanned one look very similar to the human eye if both are in true color. However, if one tries to automatically binarize a batch of documents digitized from portable cameras compared to scanners, they have different features. The knowledge of their source is fundamental for successful processing. This paper presents a classification strategy to distinguish between scanned and photographed documents. Over 16,000 documents were tested with a correct classification rate of over 99.96%.

Research Article Mon, 28 Dec 2009 00:00:00 +0200
Layout Analysis for Camera-Based Whiteboard Notes https://lib.jucs.org/article/29561/ JUCS - Journal of Universal Computer Science 15(18): 3307-3324

DOI: 10.3217/jucs-015-18-3307

Authors: Szilárd Vajda, Thomas Plötz, Gernot Fink

Abstract: A domain where, even in the era of electronic document processing, handwriting is still widely used is note-taking on a whiteboard. Such documents are either captured by a pen-tracking device or — which is much more challenging — by a camera. In both cases the layout analysis of realistic whiteboard notes is an open research problem. In this paper we propose a camera-based three-stage approach for the automatic layout analysis of whiteboard documents. Assuming a reasonable foreground-background separation of the handwriting it starts with a locally adaptive binarization followed by connected component extraction. The latter are then automatically classified as representing either simple graphical elements of a mindmap or elementary text patches. In the final stage the text patches are subject to a clustering procedure in order to generate hypotheses for those image regions where textual annotations of the mindmap can be found. In order to demonstrate the effectiveness of the proposed approach we report results of a writer independent experimental evaluation on a data set of mindmap images created by several different writers without any constraints on writing or drawing style.

Research Article Mon, 28 Dec 2009 00:00:00 +0200
Causality Join Query Processing for Data Streams via a Spatiotemporal Sliding Window https://lib.jucs.org/article/29481/ JUCS - Journal of Universal Computer Science 15(12): 2287-2310

DOI: 10.3217/jucs-015-12-2287

Authors: Oje Kwon, Ki-Joune Li

Abstract: Data streams collected from sensors contain a large volume of useful information including causal relationships. Causality join query processing involves retrieving a set of pairs (cause, effect) from streams of data. However, some causal pairs may be omitted from the query result, due to the delay between sensors and the data stream management system, and the limited size of the sliding window. In this paper, we first investigate temporal, spatial, and spatiotemporal aspects of causality join query processing for data streams. Second, we propose several strategies for sliding window management based on these results. The accuracy of the proposed strategies is studied via intensive experimentation. The result shows that we can improve the accuracy of causality join query processing in data streams with respect to the simple FIFO strategy.
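The effect of a limited sliding window on the join result can be sketched with a simple FIFO buffer of cause events. This models only the baseline FIFO strategy mentioned in the abstract, not the proposed spatiotemporal strategies; the event layout, the purely temporal join condition `max_delay` and the sample data are assumptions made for illustration.

```python
from collections import deque

def causality_join(events, window_size, max_delay):
    """Pair each effect with buffered causes that are at most max_delay older.

    events: iterable of (timestamp, kind, sensor) in arrival order,
    where kind is "cause" or "effect".
    """
    window = deque(maxlen=window_size)  # FIFO: the oldest cause is evicted first
    pairs = []
    for ts, kind, sensor in events:
        if kind == "cause":
            window.append((ts, sensor))
        else:
            for cause_ts, cause_sensor in window:
                if 0 <= ts - cause_ts <= max_delay:
                    pairs.append(((cause_ts, cause_sensor), (ts, sensor)))
    return pairs

# Hypothetical stream: three causes followed by one effect.
events = [
    (1, "cause", "s1"),
    (2, "cause", "s2"),
    (3, "cause", "s3"),
    (4, "effect", "s9"),
]
```

With `window_size=3` and `max_delay=3` all three causes are still buffered when the effect arrives, so three pairs are returned. With `window_size=2` the cause from `s1` has already been evicted and its pair is silently missing from the result, which is the kind of omission a better window management strategy tries to reduce.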

Research Article Sun, 28 Jun 2009 00:00:00 +0300
A Flexible Strategy-Based Model Comparison Approach: Bridging the Syntactic and Semantic Gap https://lib.jucs.org/article/29478/ JUCS - Journal of Universal Computer Science 15(11): 2225-2253

DOI: 10.3217/jucs-015-11-2225

Authors: Kleinner Oliveira, Karin Breitman, Toacy Oliveira

Abstract: In this paper we discuss the importance of model comparison as one of the pillars of model-driven development (MDD). We propose an innovative, flexible model comparison approach based on the composition of matching strategies. The proposed approach is fully implemented by a match operator that combines syntactical matching rule, synonym dictionary and typographic similarity strategies with a semantic, ontology-based strategy. Ontologies are semantically richer, have greater power of expression than UML models and can be formally verified for consistency, thus providing more reliability and accuracy to model comparison. The proposed approach is presented in the format of a workflow that provides clear guidance to users and facilitates the inclusion of new matching strategies and evolution.

Research Article Mon, 1 Jun 2009 00:00:00 +0300