Latest 21 Articles from JUCS - Journal of Universal Computer Science https://lib.jucs.org/ Fri, 29 Mar 2024 02:44:14 +0200

FPGA Implementation of Fast Binary Multiplication Based on Customized Basic Cells https://lib.jucs.org/article/86282/ JUCS - Journal of Universal Computer Science 28(10): 1030-1057

DOI: 10.3897/jucs.86282

Authors: Abd Al-Rahman Al-Nounou, Osama Al-Khaleel, Fadi Obeidat, Mohammad Al-Khaleel

Abstract: Multiplication is one of the most time-consuming operations and a key operation in a wide variety of embedded applications. Speeding up this operation has a significant impact on the overall performance of these applications. A vast number of multiplication approaches are found in the literature where the goal is always to achieve a higher performance. One of these approaches builds large multipliers from smaller multiplier blocks that are derived from direct Boolean algebra equations. In this work, we present a methodology for designing binary multipliers where customized partial products generation (CPPG) cells of different sizes are designed and used as smaller building blocks. The sizes of the designed CPPG cells are 2×2, 3×3, 4×4, 5×5, and 6×6. We use these cells to build 8×8, 16×16, 32×32, 64×64, and 128×128 binary multipliers. All of the CPPG cells and the binary multipliers are described in VHDL, tested, and implemented using XILINX ISE 14.6 tools targeting different FPGA families. The implementation results show that the best performance is achieved when the 3×3 cell is used and a Virtex-7 FPGA is targeted. The binary multipliers that are designed using the proposed CPPG cells achieve better performance than the binary multipliers presented in the literature. As an application that utilizes the proposed multiplier, a Multiply-Accumulate (MAC) unit is designed and implemented in Spartan-3E. The implementation results of the MAC unit demonstrate the effectiveness of the proposed multiplier.
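
The idea of building a large multiplier from small k×k cells can be sketched in software. The following Python is only an illustration under stated assumptions: `cell_multiply`, `split_chunks` and `block_multiply` are hypothetical names, the cell is modeled with ordinary integer multiplication instead of the paper's Boolean equations, and the summation is a plain loop, not the authors' adder structure.

```python
def cell_multiply(a, b, k):
    """Stand-in for a k x k cell: multiplies two operands of at most k bits.

    Assumption: a hardware cell would realize this with direct Boolean
    equations; here plain integer multiplication models its behavior.
    """
    assert 0 <= a < (1 << k) and 0 <= b < (1 << k)
    return a * b


def split_chunks(x, width, k):
    """Split a width-bit operand into k-bit chunks, least significant first."""
    mask = (1 << k) - 1
    return [(x >> shift) & mask for shift in range(0, width, k)]


def block_multiply(a, b, width=8, k=3):
    """Multiply two width-bit operands using only k x k cell products."""
    a_chunks = split_chunks(a, width, k)
    b_chunks = split_chunks(b, width, k)
    product = 0
    for i, ai in enumerate(a_chunks):
        for j, bj in enumerate(b_chunks):
            # Each partial product is weighted by the position of its chunks.
            product += cell_multiply(ai, bj, k) << (k * (i + j))
    return product
```

With `width=8` and `k=3` the operands split into chunks of 3, 3 and 2 bits, so nine cell products are summed. The sketch says nothing about delay or area, which are what the paper actually evaluates.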

Research Article Fri, 28 Oct 2022 10:30:00 +0300
Temporal Accelerators: Unleashing the Potential of Embedded FPGAs https://lib.jucs.org/article/77247/ JUCS - Journal of Universal Computer Science 27(11): 1174-1192

DOI: 10.3897/jucs.77247

Authors: Christopher Cichiwskyj, Gregor Schiele

Abstract: When the complexity of a problem rises, its solution requires more hardware resources. A usual way to solve this is to use larger processors and add more memory. When using Field Programmable Gate Arrays (FPGAs), which can instantiate arbitrary circuit designs, a larger, more costly and power-hungry chip is used. In this paper we propose a different approach, namely to split the problem into a graph of interdependent smaller tasks and to reconfigure a small FPGA during runtime to execute these tasks efficiently, one after another. This can result in cheaper and more energy-efficient systems that can execute very complex problems locally. We present a basic analytical model, evaluate its accuracy and discuss initial insights from it.

Research Article Sun, 28 Nov 2021 10:00:00 +0200
A reversible circuit synthesis algorithm with progressive increase of controls in generalized Toffoli gates https://lib.jucs.org/article/69617/ JUCS - Journal of Universal Computer Science 27(6): 544-563

DOI: 10.3897/jucs.69617

Authors: Edinelço Dalcumune, Luis Antonio Brasil Kowada, André da Cunha Ribeiro, Celina Miraglia Herrera de Figueiredo, Franklin de Lima Marquezino

Abstract: We present a new algorithm for synthesis of reversible circuits for arbitrary n-bit bijective functions. This algorithm uses generalized Toffoli gates, which include positive and negative controls. Our algorithm is divided into two parts. First, we use partially controlled generalized Toffoli gates, progressively increasing the number of controls. Second, exploring the properties of the representation of permutations in disjoint cycles, we apply generalized Toffoli gates with controls on all lines except for the target line. The novelty of the method is therefore that the obtained circuits use low-cost gates first and resort to gates of increasing cost only towards the end of the synthesis. In addition, we employ two bidirectional synthesis strategies to improve the gate count, which is the metric used to compare the results obtained by our algorithm with the results presented in the literature. Accordingly, our experimental results consider all 3-bit bijective functions and twenty widely used benchmark functions. The results obtained by our synthesis algorithm are competitive when compared with the best results known in the literature, considering as a complexity metric just the number of gates, as done by the best alternative heuristics found in the literature. For example, for all 3-bit bijective functions using the generalized Toffoli gate library, we obtained an average gate count of 5.23, the best so far.
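
To make the gate model concrete, here is a minimal Python sketch of a generalized Toffoli gate with positive and negative controls acting on an n-bit state. It is not the synthesis algorithm from the paper; the names `apply_toffoli` and `circuit_permutation` and the `(target, pos, neg)` gate encoding are assumptions made for illustration.

```python
def apply_toffoli(state, target, pos=(), neg=()):
    """Flip bit `target` of `state` iff every positive control line is 1
    and every negative control line is 0. Lines are bit indices."""
    pos_ok = all((state >> line) & 1 for line in pos)
    neg_ok = not any((state >> line) & 1 for line in neg)
    if pos_ok and neg_ok:
        return state ^ (1 << target)
    return state


def circuit_permutation(gates, n):
    """Return the permutation of range(2**n) computed by a gate cascade.

    `gates` is a list of (target, pos, neg) triples (hypothetical encoding).
    """
    perm = []
    for x in range(1 << n):
        y = x
        for target, pos, neg in gates:
            y = apply_toffoli(y, target, pos, neg)
        perm.append(y)
    return perm
```

For example, `circuit_permutation([(2, (0,), (1,))], 3)` flips line 2 exactly when line 0 is 1 and line 1 is 0, which swaps states 1 and 5 and leaves the others fixed. Because the target is never among its own controls, every such gate is its own inverse.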

Research Article Mon, 28 Jun 2021 10:00:00 +0300
Dynamic Estimation of Temporary Failure in SoC FPGAs for Heterogeneous Applications https://lib.jucs.org/article/23789/ JUCS - Journal of Universal Computer Science 24(12): 1776-1799

DOI: 10.3217/jucs-024-12-1776

Authors: J. Kokila, N. Ramasubramanian, Ravindra Thamma

Abstract: Recent processors are shrinking in size due to the advancement of technology. Reliability is an important design parameter along with power, cost, and performance. The processors need to be fault tolerant to counter reliability challenges. This work proposes a dynamic thermal and voltage management (DTVM) system which ensures a reasonable level of fault tolerance. The fault tolerance system (FTS) identifies temporary failures and can subsequently forecast them at run-time. The temporary failures are dynamically estimated on SoC FPGAs for a class of heterogeneous applications. Dynamic priority scheduling based on absolute deadlines is adopted to improve the FTS. Experimental results indicate that the failure rate is reduced by 7.2% with variations of 2% and 12% in temperature and voltage, respectively.

Research Article Fri, 28 Dec 2018 00:00:00 +0200
A Fine-Grained Hardware Security Approach for Runtime Code Integrity in Embedded Systems https://lib.jucs.org/article/23154/ JUCS - Journal of Universal Computer Science 24(4): 515-536

DOI: 10.3217/jucs-024-04-0515

Authors: Xiang Wang, Weike Wang, Bin Xu, Pei Du, Lin Li, Muyang Liu

Abstract: Embedded systems are subjected to various attacks, including software attacks, physical attacks, and side channel attacks. Most of these malicious attacks can lead to the invalid execution of programs, launch destructive actions, or reveal critical information. However, most previous security mechanisms suffer from coarse checking granularity and unacceptable performance overhead, due to strict restrictions on system resources. This paper presents a fine-grained hardware-based security approach to ensure runtime code integrity in embedded systems by offline profiling of the program features and runtime integrity checking. We design a hardware-implemented instruction stream integrity checker (ISIC) to perform runtime checking of pre-extracted features. Any invalid execution of the program will trigger the corresponding exception signal. We implement the ISIC with an OR1200 processor on an XC5VLX50T field-programmable gate array (FPGA). The experimental results show that the proposed approach can detect all the attacks that destroy the integrity of the instruction stream, and the performance overhead induced by the security mechanism is less than 3.45% according to the selected benchmarks.

Research Article Sat, 28 Apr 2018 00:00:00 +0300
Fast Self-Reconfigurable Embedded System on Spartan-3 https://lib.jucs.org/article/23009/ JUCS - Journal of Universal Computer Science 19(3): 301-324

DOI: 10.3217/jucs-019-03-0301

Authors: Enrique Cantó, Mariano Fons, Francesc Fons, Mariano López, Rafael Ramos

Abstract: Many image-processing algorithms require several processing stages that cannot be completed by embedded microprocessors in a reasonable time, due to their high computational cost. A set of dedicated coprocessors can accelerate these algorithms, although the main drawback is the area needed for their implementation. The main advantage of a reconfigurable system is that several coprocessors designed to perform different operations can be mapped on the same area in a time-multiplexed way. This work presents the architecture of an embedded system composed of a microprocessor and a run-time reconfigurable coprocessor, mapped on Spartan-3, the low-cost family of Xilinx FPGAs. Designing reconfigurable systems on Spartan-3 requires much design effort, since unlike higher cost families of Xilinx FPGAs, this device does not officially support partial reconfiguration. In order to overcome this drawback, the paper also describes the main steps used in the design flow to obtain a successful design. The main goal of the presented architecture is to reduce the coprocessor reconfiguration time, as well as accelerate image-processing algorithms. The experimental results demonstrate significant improvement in both objectives. The reconfiguration rate reaches nearly 320 Mb/s, far higher than in previous related work.

Research Article Fri, 1 Feb 2013 00:00:00 +0200
The Forum for Negative Results (FNR) Guest Editorial https://lib.jucs.org/article/23977/ JUCS - Journal of Universal Computer Science 18(20): 2748-2749

DOI: 10.3217/jucs-018-20-2748

Authors: Lutz Prechelt

Abstract: In September 1997, J.UCS published an article titled "Why we Need an Explicit Forum for Negative Results" [Prechelt, 1997]. It argued that when a plausible approach for solving a computer science or software engineering problem had failed to work out, it was silly for the scientific system not to publish the attempt iff a useful insight had been gained along the way nevertheless. Due to the strong bias of essentially all Computer Science publication venues towards "successful" research results, it was thus required to call for such negative results explicitly in order to avoid that those results would either be misleadingly disguised as successes or disappear in some closet. The article declared that J.UCS had thus agreed to create the "Forum for Negative Results (FNR)" as a permanent special section of J.UCS.

Research Article Sat, 1 Dec 2012 00:00:00 +0200
Design of Arbiters and Allocators Based on Multi-Terminal BDDs https://lib.jucs.org/article/29734/ JUCS - Journal of Universal Computer Science 16(14): 1826-1852

DOI: 10.3217/jucs-016-14-1826

Authors: Václav Dvořák, Petr Mikušek

Abstract: Assigning one (more) shared resource(s) to several requesters is a function of arbiters (allocators). This class of decision-making modules can be implemented in a number of ways, from hardware to firmware to software. The paper presents a new computer-aided technique that can produce representations of arbiters/allocators in the form of a Multi-Terminal Binary Decision Diagram (MTBDD) with close to minimum cost and width. This diagram can then serve as a prototype for a cascade of multiple-output look-up tables (LUTs) that implements the given function, or for an efficient firmware implementation. The technique makes use of iterative decomposition of integer functions of Boolean variables and a variable-ordering heuristic. The LUT cascades lead directly to a pipelined design, simplify wiring and testing, and can compete with traditional FPGA design in performance and with PLA design in chip area.

Research Article Wed, 28 Jul 2010 00:00:00 +0300
An IP Core and GUI for Implementing Multilayer Perceptron with a Fuzzy Activation Function on Configurable Logic Devices https://lib.jucs.org/article/29084/ JUCS - Journal of Universal Computer Science 14(10): 1678-1694

DOI: 10.3217/jucs-014-10-1678

Authors: Alfredo Rosado-Muñoz, Luis Gomez-Chova, Joan Francés

Abstract: This paper describes the development of an Intellectual Property (IP) core in VHDL able to implement a Multilayer Perceptron (MLP) artificial neural network (ANN) topology with up to 2 hidden layers, 128 neurons, and 31 inputs per neuron. Neural network models are usually developed by using programming languages, such as Matlab®. However, their implementation in configurable logic hardware requires the use of some other tools and hardware description languages, such as VHDL. For easy migration, a Matlab Graphical User Interface (GUI) to automatically translate the ANN architecture to VHDL code has been developed. In addition, the use of an activation function based on fuzzy logic for the implementation of the MLP neural network simplifies the logic and improves the results. The environment was tested using a typical prediction problem, the Mackey-Glass series, where several ANN topologies were generated, tested and implemented in an FPGA. Results show excellent agreement between the software model and the hardware implementation.

Research Article Wed, 28 May 2008 00:00:00 +0300
Design and Implementation of the AMCC Self-Timed Microprocessor in FPGAs https://lib.jucs.org/article/28750/ JUCS - Journal of Universal Computer Science 13(3): 377-387

DOI: 10.3217/jucs-013-03-0377

Authors: Susana Ortega-Cisneros, Juan Raygoza-Panduro, Alberto de la Mora Gálvez

Abstract: The development of processors with full custom technology has some disadvantages, such as the time needed to design the processors and the cost of the implementation. In this article we use FPGA programmable circuits as a low-cost option for the development and implementation of Self-Timed (ST) systems. In addition, the article describes the architecture and the modules that compose the Asynchronous Microprocessor of Centralized Control (AMCC), and reviews the resource occupation of the FPGA implementation. The operation of this processor requires only an external pulse at the input of the first asynchronous control block; this pulse starts the request-acknowledge sequence of the control unit, which activates the fetch cycle and begins the execution of the instructions, without the need for a clock feeding the system. Once the program has concluded, the microprocessor stops, inherently providing the stoppable clock feature; i.e., the circuit is stopped when it is not required (minimal dynamic consumption) until it is activated again by an external request signal.

Research Article Wed, 28 Mar 2007 00:00:00 +0300
Reversible Karatsuba's Algorithm https://lib.jucs.org/article/28614/ JUCS - Journal of Universal Computer Science 12(5): 499-511

DOI: 10.3217/jucs-012-05-0499

Authors: Luis Antonio Brasil Kowada, Renato Portugal, Celina Miraglia Herrera de Figueiredo

Abstract: Karatsuba discovered the first algorithm that accomplishes multiprecision integer multiplication with complexity below that of the grade-school method. This algorithm is implemented nowadays in computer algebra systems using irreversible logic. In this paper we describe reversible circuits for Karatsuba's algorithm and analyze their computational complexity. We discuss garbage disposal methods and compare them with Bennett's well-known schemes. These circuits can be used in reversible computers, which have the advantage of being very efficient in terms of energy consumption. The algorithm can also be used in quantum computers and is an improvement on previous circuits for the same purpose described in the literature.
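
For reference, the classical (irreversible) form of Karatsuba's algorithm can be sketched in a few lines of Python. This is the textbook recursion on non-negative integers, not the reversible circuit construction of the paper; the base-case threshold of 16 is an arbitrary choice for the sketch.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers with three recursive
    half-size products instead of four."""
    if x < 16 or y < 16:          # small operands: multiply directly
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    hi = karatsuba(x_hi, y_hi)
    lo = karatsuba(x_lo, y_lo)
    # (x_hi + x_lo)(y_hi + y_lo) - hi - lo equals x_hi*y_lo + x_lo*y_hi.
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo
    return (hi << (2 * half)) + (mid << half) + lo
```

The subtraction that produces `mid` discards intermediate values, which is exactly the kind of step a reversible version has to account for as garbage.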

Research Article Sun, 28 May 2006 00:00:00 +0300
A Modular Architecture for Nodes in Wireless Sensor Networks https://lib.jucs.org/article/28587/ JUCS - Journal of Universal Computer Science 12(3): 328-339

DOI: 10.3217/jucs-012-03-0328

Authors: Jorge Portilla, Angel De Castro, Eduardo Torre, Teresa Riesgo

Abstract: Sensor networks have grown in recent years and, within this field, wireless sensor networks are growing particularly fast, as many applications demand the use of many nodes, even hundreds or thousands. More and more applications are emerging to solve problems in data acquisition and control in different environments, taking advantage of this technology. In this context, hardware design of the sensor network node becomes critical to satisfy the hard constraints imposed by wireless sensor networks, such as low power consumption, small size and low cost. Moreover, these nodes must be capable of sensing, processing and communicating physical parameters, becoming true smart sensors in a network. With this goal in mind, we propose a modular architecture for the nodes, composed of four layers: communication, processing, power supply and sensing. The purpose is to minimize the redesign effort as well as to make the node flexible and adaptable to many different applications. As a first prototype, we present a node with a mixed design based on a microcontroller and an FPGA for the processing layer and Bluetooth technology for communications.

Research Article Tue, 28 Mar 2006 00:00:00 +0300
Hardware Design and Functional Programming: a Perfect Match https://lib.jucs.org/article/28436/ JUCS - Journal of Universal Computer Science 11(7): 1135-1158

DOI: 10.3217/jucs-011-07-1135

Authors: Mary Sheeran

Abstract: This paper aims to explain why I am still fascinated by the use of functional languages in hardware design. I hope that some readers will be tempted to tackle some of the hard problems that I outline in the final section. In particular, I believe that programming language researchers have much to contribute to the field of hardware design.

Research Article Thu, 28 Jul 2005 00:00:00 +0300
Function-Complete Lookahead in Support of Efficient SAT Search Heuristics https://lib.jucs.org/article/28327/ JUCS - Journal of Universal Computer Science 10(12): 1655-1692

DOI: 10.3217/jucs-010-12-1655

Authors: John Franco, Michal Kouril, John Schlipf, Sean Weaver, Michael Dransfield, W. Vanfleet

Abstract: Recent work has shown the value of using propositional SAT solvers, as opposed to pure BDD solvers, for solving many real-world Boolean Satisfiability problems including Bounded Model Checking problems (BMC). We propose a SAT solver paradigm which combines the use of BDDs and search methods to support efficient implementation of complex search heuristics and effective use of early (preprocessor) learning. We implement many of these ideas in software called SBSAT. We show that SBSAT solves many of the benchmarks tested competitively or substantially faster than state-of-the-art SAT solvers. SBSAT differs from standard propositional SAT solvers by working directly with non-CNF propositional input; its input format is BDDs. This allows some BDD-style processing to be used as a preprocessing tool. After preprocessing, the BDDs are transformed into state machines (different state machines than the ones used in the original model checking problem) and a good deal of lookahead information is precomputed and memoized. This provides for fast implementation of a new form of lookahead, called local-function-complete lookahead (contrasting with the depth-first lookahead of zChaff [Moskewicz et al. 01] and the breadth-first lookahead of Prover [Stålmarck 94]). SBSAT provides a choice of search heuristics, allowing users to exploit domain-specific experience. We describe SBSAT in this paper. We use SBSAT in conjunction with the tool bmc from Carnegie Mellon to translate a bounded model checking problem to classical propositional logic and then use SBSAT to solve the bmc output. We show this approach is faster than the now traditional approach of translating the bmc output to CNF clauses and using a CNF-based SAT solver, such as zChaff. The work continues that of [Franco et al. 01] and [Franco et al. 04].

Research Article Tue, 28 Dec 2004 00:00:00 +0200
Using Global Structural Relationships of Signals to Accelerate SAT-based Combinational Equivalence Checking https://lib.jucs.org/article/28324/ JUCS - Journal of Universal Computer Science 10(12): 1597-1628

DOI: 10.3217/jucs-010-12-1597

Authors: Rajat Arora, Michael Hsiao

Abstract: We propose a novel technique to improve SAT-based Combinational Equivalence Checking (CEC). The idea is to perform a low-cost preprocessing that will statically induce global signal relationships into the original CNF formula of the miter circuit under verification, and hence reduce the complexity of the SAT instance. This efficient and effective preprocessing quickly builds up the implication graph for the miter circuit under verification, yielding a large set of direct, indirect and extended backward implications. These two-node implications spanning the entire circuit are converted into binary clauses, and they are added to the miter CNF formula. The added clauses constrain the search space of the SAT solver and provide correlation among the different variables, which enhances the Boolean Constraint Propagation (BCP). Experimental results on large and difficult ISCAS'85, ISCAS'89 (full scan) and ITC'99 (full scan) CEC instances show that our approach is independent of the state-of-the-art SAT solver used, and that the added clauses help to achieve noteworthy speedup for each of the cases. Also, comparison with Hyper-Resolution (Hypre), Non-Increasing Variable Elimination Resolution (NIVER) and the propositional formula checker HeerHugo suggests that our technique is more powerful, yielding non-trivial clauses that significantly simplify the SAT instance complexity.
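
The step of turning a two-node implication into a binary clause rests on the equivalence (a → b) ≡ (¬a ∨ b). A minimal Python sketch, assuming DIMACS-style integer literals (positive for a signal at 1, negative for a signal at 0); the function names and the tuple encoding of an implication are hypothetical and not taken from the paper.

```python
def literal(var, value):
    """DIMACS-style literal asserting that signal `var` has `value`."""
    return var if value else -var


def implication_to_clause(a, a_val, b, b_val):
    """Encode (a == a_val) -> (b == b_val) as the binary clause
    NOT(a == a_val) OR (b == b_val)."""
    return (-literal(a, a_val), literal(b, b_val))


def add_implications(cnf, implications):
    """Return `cnf` extended with one binary clause per implication,
    skipping clauses that are already present."""
    seen = {frozenset(clause) for clause in cnf}
    result = list(cnf)
    for implication in implications:
        clause = implication_to_clause(*implication)
        if frozenset(clause) not in seen:
            seen.add(frozenset(clause))
            result.append(clause)
    return result
```

For instance, the implication "signal 1 at 1 forces signal 2 to 0" becomes the clause `(-1, -2)`. How the implications themselves are found (direct, indirect, extended backward) is the paper's contribution and is not modeled here.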

Research Article Tue, 28 Dec 2004 00:00:00 +0200
Implementation of an Embedded Hardware Description Language Using Haskell https://lib.jucs.org/article/28083/ JUCS - Journal of Universal Computer Science 9(8): 795-812

DOI: 10.3217/jucs-009-08-0795

Authors: Nélio Muniz Mendes Alves, Sérgio Schneider

Abstract: This paper describes an ongoing implementation of an embedded hardware description language (HDL) using Haskell as a host language. Traditionally, functional HDLs are made using lazy lists to model signals, so circuits are functions from lists of input values to lists of output values. We use another known approach for embedded languages, in which circuits are data structures rather than functions. This style of implementation permits one to inspect the structure of the circuit, allowing one to perform different interpretations for the same description. The approach we present can also be applied to other domain-specific embedded languages. We provide an elegant implementation of memories and a set of new signal types.
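
The "circuits as data structures" style can be illustrated with a small sketch. It is written in Python rather than Haskell and is not the authors' language: the `Input` and `Gate` types and the three interpretations are assumptions chosen only to show how one description supports several readings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Input:
    name: str


@dataclass(frozen=True)
class Gate:
    op: str       # one of "and", "xor", "not"
    args: tuple   # sub-circuits feeding this gate


OPS = {
    "and": lambda a, b: a & b,
    "xor": lambda a, b: a ^ b,
    "not": lambda a: 1 - a,
}


def simulate(circuit, env):
    """Interpretation 1: compute the output bit for given input bits."""
    if isinstance(circuit, Input):
        return env[circuit.name]
    return OPS[circuit.op](*(simulate(arg, env) for arg in circuit.args))


def gate_count(circuit):
    """Interpretation 2: count gates in the expression tree."""
    if isinstance(circuit, Input):
        return 0
    return 1 + sum(gate_count(arg) for arg in circuit.args)


def to_text(circuit):
    """Interpretation 3: render the structure as a netlist-like string."""
    if isinstance(circuit, Input):
        return circuit.name
    return f"{circuit.op}({', '.join(to_text(arg) for arg in circuit.args)})"


def half_adder(a, b):
    """Build (sum, carry) descriptions; nothing is evaluated here."""
    return Gate("xor", (a, b)), Gate("and", (a, b))
```

Because `half_adder` returns plain data, the same description can be simulated, measured or printed. A circuit modeled as a function on lazy lists could only be run.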

Research Article Thu, 28 Aug 2003 00:00:00 +0300
Group Theoretical Aspects of Reversible Logic Gates https://lib.jucs.org/article/27559/ JUCS - Journal of Universal Computer Science 5(5): 307-321

DOI: 10.3217/jucs-005-05-0307

Authors: Leo Storme, Alexis Vos, Gerald Jacobs

Abstract: Logic gates with three input bits and three output bits have a privileged position within fundamental computer science: they are a sufficient building block for constructing arbitrary reversible Boolean networks and therefore are the key to reversible digital computers. Such computers can, in principle, operate without heat production. As there exist as many as 8! = 40,320 different 3-bit reversible truth tables, the question arises as to which ones to choose as building blocks. Because these gates form a group with respect to the operation "cascading", we can apply group theoretical tools in order to make such a choice.
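
The count 8! = 40,320 and the group structure under cascading can be checked directly. In this Python sketch a 3-bit reversible gate is represented as a tuple that maps each of the 8 input patterns to an output pattern; this representation, the names `cascade` and `inverse`, and the example `TOFFOLI` tuple are assumptions made for illustration.

```python
from itertools import permutations
from math import factorial


def cascade(f, g):
    """Feed the outputs of gate f into gate g (apply f, then g)."""
    return tuple(g[f[x]] for x in range(len(f)))


def inverse(f):
    """Return the gate that undoes f."""
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)


# Every reversible 3-bit truth table is a permutation of the 8 patterns.
all_gates = list(permutations(range(8)))
count = len(all_gates)

# Example gate: flip bit 2 when bits 0 and 1 are both 1 (swaps 3 and 7).
TOFFOLI = (0, 1, 2, 7, 4, 5, 6, 3)
```

Here `count` equals `factorial(8)`, cascading any gate with its inverse gives the identity tuple `(0, 1, ..., 7)`, and `cascade(TOFFOLI, TOFFOLI)` is also the identity. Closure, identity and inverses are the group properties the abstract relies on.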

Research Article Fri, 28 May 1999 00:00:00 +0300
The Average Case Performance of an Algorithm for Demand-Driven Evaluation of Boolean Formulae https://lib.jucs.org/article/27558/ JUCS - Journal of Universal Computer Science 5(5): 288-306

DOI: 10.3217/jucs-005-05-0288

Authors: Paul Dunne, Paul Leng

Abstract: Demand-driven simulation is an approach to the simulation of digital logic circuits that was proposed, independently, in the work of several authors. Experimental studies of the paradigm have indicated that this approach may reduce the time required for simulation, when compared with event-driven techniques. In this paper we present some analytic support for these experimental results by analysing the average number of gates evaluated with a naive demand-driven algorithm for formula evaluation.
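
One simple reading of demand-driven formula evaluation is short-circuiting: a gate's second input is evaluated only if the first does not already decide the output. The Python sketch below counts the gates visited under that reading. It is an assumption about the general idea, not the exact algorithm analysed in the paper; formulas are nested tuples and `demand_eval` is a hypothetical name.

```python
def demand_eval(node, env, counter):
    """Evaluate a formula on demand, counting the gates evaluated.

    node is ("var", name), ("not", f), ("and", f, g) or ("or", f, g).
    counter is a one-element list so the count survives the recursion.
    """
    kind = node[0]
    if kind == "var":
        return env[node[1]]
    counter[0] += 1                      # this gate is evaluated
    if kind == "not":
        return not demand_eval(node[1], env, counter)
    left = demand_eval(node[1], env, counter)
    if kind == "and" and not left:       # output already known to be False
        return False
    if kind == "or" and left:            # output already known to be True
        return True
    return demand_eval(node[2], env, counter)
```

For the formula `a or (b and not c)`, which has three gates, setting `a` to true evaluates only the `or` gate. An evaluation that visits every gate would always count three.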

Research Article Fri, 28 May 1999 00:00:00 +0300
Why We Need an Explicit Forum for Negative Results https://lib.jucs.org/article/27412/ JUCS - Journal of Universal Computer Science 3(9): 1074-1083

DOI: 10.3217/jucs-003-09-1074

Authors: Lutz Prechelt

Abstract: Current Computer Science (CS) research is primarily focused on solving engineering problems. Often though, promising attempts for solving a particular problem fail for non-avoidable reasons. This is what I call a negative result: something that should have worked does not. Due to the current CS publication climate such negative results today are usually camouflaged as positive results by non-evaluating or mis-evaluating the research or by redefining the problem to fit the solution. Such publication behavior hampers progress in CS by suppressing some valuable insights, producing spurious understanding, and misleading further research efforts. Specific examples given below illustrate and back up these claims. This paper is the announcement of a (partial) remedy: a permanent publication forum explicitly for negative CS research results, called the Forum for Negative Results, FNR. FNR will be a regular part of J.UCS.

Research Article Sun, 28 Sep 1997 00:00:00 +0300
Prototyping on the PC with Programmable Hardware https://lib.jucs.org/article/27332/ JUCS - Journal of Universal Computer Science 3(2): 86-119

DOI: 10.3217/jucs-003-02-0086

Authors: Jamaludin Omar, James Noras

Abstract: This paper describes how to design and use a framework of hardware and software for flexible interfacing and prototyping on the PC. The hardware comprises a card with programmable hardware provided by FPGAs, with an interface including DMA block transfer and interrupts. A library of hardware macros is described. Software routines are provided to enable the FPGAs to be programmed and to allow communication between the host PC and the peripheral card. Examples are given to show its use in building and testing designs, so that new applications can be prototyped quickly using a proven and reliable interface.

Research Article Fri, 28 Feb 1997 00:00:00 +0200
Bounds on Size of Decision Diagrams https://lib.jucs.org/article/27321/ JUCS - Journal of Universal Computer Science 3(1): 2-22

DOI: 10.3217/jucs-003-01-0002

Authors: Václav Dvořák

Abstract: Known upper bounds on the number of required nodes (size) in the ordered binary and multiple-valued decision diagram (DD) for representation of logic functions are reviewed and reduced by a small constant factor. New upper bounds are derived for partial logic functions containing don't cares and also for complete Boolean functions specified by Boolean expressions. The evaluation of upper bounds is based on a bottom-up algorithm for constructing efficient ordered DDs developed by the author.

Research Article Tue, 28 Jan 1997 00:00:00 +0200