required, produce a time varying waveform by the application of variable voltage to the input of the voltage-controlled oscillator.

1 Introduction
In biomedical engineering, computing time is a well-known limitation in real-time signal processing, especially when microcomputers are involved.

Computer-assisted monitoring of ECG, evoked response analysis of EEG and power-density analysis of EMG, for instance, require signal-processing operations such as digital filtering, spectral analysis and pattern recognition. Within these operations, convolution and correlation are used extensively (Jones, 1982). Reducing the number of time-consuming operations has often led to considerable concessions in the performance of the algorithms concerned. Sometimes special equipment has been developed to perform a specific task such as digital filtering (Kolb, 1983). We have studied several operations in order to identify and isolate their most time-consuming components.

The convolution sum, used to calculate the output signal $y(k)$ of a nonrecursive digital filter, can be described by eqn. 1, where $u(k)$ is the input signal and $h(i)$ is the impulse response of length $N$.

$$y(k) = \sum_{i=0}^{N-1} h(i)u(k - i) \quad (1)$$
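As a minimal sketch of eqn. 1 (not the authors' implementation), the following Python computes each output sample as an inner product of the impulse response with the most recent $N$ input samples. The function name `fir_output`, the example data and the treatment of samples before the start of the signal as zero are assumptions made for illustration.

```python
def fir_output(h, u, k):
    """Return y(k) = sum_{i=0}^{N-1} h(i) * u(k - i), as in eqn. 1.

    Input samples with a negative index are treated as zero (an assumption).
    """
    total = 0
    for i, coeff in enumerate(h):
        idx = k - i
        if idx >= 0:
            total += coeff * u[idx]
    return total


h = [1, 2, 3]        # hypothetical impulse response, N = 3
u = [4, 5, 6, 7]     # hypothetical input samples
y = [fir_output(h, u, k) for k in range(len(u))]
```

Each call performs up to $N$ multiplications and additions, which is the workload the rest of this note is concerned with.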

The discrete Fourier transform $X(j\omega)$ of a sampled signal $x(k)$ can be derived using eqn. 2:

$$X(j\omega) = \sum_{i=0}^{N-1} x(i) \cos (\omega i/N) - j \sum_{i=0}^{N-1} x(i) \sin (\omega i/N) \quad (2)$$
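Eqn. 2 can be read as two inner products: one of the signal with a cosine vector and one with a sine vector. The sketch below illustrates this under the assumption that $\omega = 2\pi m$ for the $m$-th frequency bin (eqn. 2 itself leaves $\omega$ unspecified); the function name `dft_bin` and the test signal are hypothetical.

```python
import math


def dft_bin(x, omega):
    """Evaluate eqn. 2 as two inner products (cosine part and sine part)."""
    n = len(x)
    cos_vec = [math.cos(omega * i / n) for i in range(n)]
    sin_vec = [math.sin(omega * i / n) for i in range(n)]
    real = sum(xi * ci for xi, ci in zip(x, cos_vec))
    imag = -sum(xi * si for xi, si in zip(x, sin_vec))
    return complex(real, imag)


# One period of a cosine sampled at four points (hypothetical test signal).
x = [1.0, 0.0, -1.0, 0.0]
X0 = dft_bin(x, 0.0)                # assumed omega for bin m = 0
X1 = dft_bin(x, 2 * math.pi * 1)    # assumed omega = 2*pi*m with m = 1
```

Because the cosine and sine vectors depend only on $\omega$ and $N$, they can be computed once and reused for every new block of samples.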

Technical note
VIPER II, an improved parallel processor for real-time calculation of inner products
J. A. van Alsté, A. J. Mulder
Biomedical Engineering Division, Department of Electrical Engineering, Twente University of Technology, PO Box 217, 7500 AE Enschede, The Netherlands

Keywords—Digital filtering, Inner products, Parallel processor, Q-bus, Signal processing


First received 18th November 1985 and in final form 4th July 1986 © IFMBE 1987
The correlation coefficient \( r \) can be used to recognise patterns described by a template \( k(i) \) in a one-dimensional signal \( y(k) \), as shown in eqn. 3. Both \( k \) and \( y \) are assumed to have zero mean over the observed interval.

\[
r = \frac{\sum_{i=0}^{N-1} k(i)y(i)}{\sqrt{\left( \sum_{i=0}^{N-1} k(i)^2 \right) \left( \sum_{i=0}^{N-1} y(i)^2 \right)}} \quad (3)
\]
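The correlation coefficient can be sketched with three inner products, assuming the usual normalised form in which the inner product of template and signal is divided by the square root of the product of their energies. The helper names `inner` and `correlation_coefficient` and the example vectors are hypothetical, and the inputs are taken to be zero-mean already.

```python
import math


def inner(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def correlation_coefficient(template, signal):
    """Normalised correlation built from three inner products."""
    denom = math.sqrt(inner(template, template) * inner(signal, signal))
    if denom == 0:
        raise ValueError("correlation is undefined for a zero-energy vector")
    return inner(template, signal) / denom


template = [1, -1, 1, -1]                                      # zero-mean template
r_same = correlation_coefficient(template, [2, -2, 2, -2])      # scaled copy
r_opposite = correlation_coefficient(template, [-3, 3, -3, 3])  # inverted copy
```

A scaled copy of the template gives a coefficient of 1 and an inverted copy gives -1, independent of amplitude.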

The computer time-consuming part of eqns. 1–3 comprises inner products of two vectors as described in eqn. 4.

\[
\text{inner product } U \cdot V = \sum_{i=0}^{N-1} u(i)v(i) \quad (4)
\]

where

\[
U = u(0), u(1), \ldots, u(N - 1)
\]

and

\[
V = v(0), v(1), \ldots, v(N - 1)
\]

In signal processing, one or both of the vectors \( U \) and \( V \) often represent part of a time-dependent signal. In such cases it is convenient to store these vectors in so-called circular buffers. In this way the data manipulation can be restricted to one vector element per sample interval, i.e. replacing the oldest element with the newest.
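A minimal software sketch of the circular-buffer idea follows. It is an illustration only, not a description of the hardware: the class name `CircularBuffer`, its methods and the example data are assumptions.

```python
class CircularBuffer:
    """Fixed-length buffer in which each new sample overwrites the oldest one."""

    def __init__(self, length):
        self.data = [0] * length
        self.newest = length - 1      # index of the most recent sample

    def push(self, sample):
        # The oldest element sits directly after the newest one (modulo length).
        self.newest = (self.newest + 1) % len(self.data)
        self.data[self.newest] = sample

    def inner_product(self, coeffs):
        # coeffs[0] pairs with the newest sample, coeffs[1] with the one before, etc.
        n = len(self.data)
        return sum(c * self.data[(self.newest - i) % n]
                   for i, c in enumerate(coeffs))


buf = CircularBuffer(3)
outputs = []
for sample in [4, 5, 6, 7]:           # hypothetical input samples
    buf.push(sample)                  # only one element changes per sample
    outputs.append(buf.inner_product([1, 2, 3]))
```

Only one memory write is needed per sample interval; the remaining elements stay in place and the read addresses wrap around instead.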

Because of the relatively slow execution of the integer multiply instruction, the inner product is very time consuming. Furthermore, the data manipulations needed for the inner product calculation take a considerable amount of time.

To overcome this bottleneck we have developed a special-purpose processor for the fast calculation of inner products. The processor is partly based on experience with a predecessor (Van Alsté and Luursema, 1985) but having less elaborate features. The apparatus is called VIPER-II, after its predecessor VIPER (Vector Inner Product Equipment for Real time). It operates in parallel with a DEC LSI processor and is connected to the Q-bus by its own built-in interface. With VIPER-II, inner products no longer load the host processor, which stays available for other data and control manipulations.

VIPER-II calculates a number of inner products of vectors consisting of 16-bit integer-valued arrays whose length may vary from 64 to 512 elements. The vectors may be loaded or changed at any time between calculations, which makes VIPER-II suitable for implementing, for instance, adaptive filters or correlators. The data exchange is controlled by means of a control/status register and interrupt facilities, and is minimised by the use of circular buffers.

2 Principle of operation

VIPER-II is realised as a special-purpose processor that operates on the Q-bus in parallel with the LSI-11 processor. It contains two separate random access memory (RAM) banks of 4096 × 16 bits which are switched antiparallel as shown in Fig. 1.

One of the RAM banks is connected to VIPER-II's processor while the other is accessible to the LSI-11 processor like normal computer RAM. After calculation of the inner products in one RAM bank, this RAM bank is switched to the Q-bus and the other bank to the VIPER-II processor.

Each RAM bank has to contain a coherent set of vectors and a control microcode to match. The inner product results are also stored in this bank. The memory maps of both RAM banks are equal and shown in Fig. 2.
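The alternating use of the two RAM banks can be sketched in software as a ping-pong (double-buffer) scheme. This is a simplified model under stated assumptions, not the hardware design: the class `PingPongBanks`, the dictionary layout of a bank and the single inner product per run are all hypothetical.

```python
class PingPongBanks:
    """Two banks: one visible to the host, the other to the coprocessor."""

    def __init__(self):
        self.banks = [{"u": [], "v": [], "result": None},
                      {"u": [], "v": [], "result": None}]
        self.host_bank = 0            # index of the bank on the host side

    def host_view(self):
        return self.banks[self.host_bank]

    def coprocessor_view(self):
        return self.banks[1 - self.host_bank]

    def swap(self):
        # Antiparallel switching: both banks change sides at once.
        self.host_bank = 1 - self.host_bank

    def run_coprocessor(self):
        # Vectors and result live in the same bank, as in the text.
        bank = self.coprocessor_view()
        bank["result"] = sum(a * b for a, b in zip(bank["u"], bank["v"]))


banks = PingPongBanks()
banks.host_view().update(u=[1, 2, 3], v=[4, 5, 6])  # host loads bank 0
banks.swap()                  # bank 0 goes to the coprocessor
banks.run_coprocessor()       # meanwhile the host could load bank 1
banks.swap()                  # bank 0 returns to the host side
result = banks.host_view()["result"]
```

The point of the arrangement is that loading new data and calculating never compete for the same memory.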

The inner product processor consists of a fast multiplier/accumulator integrated circuit with control and vector element address generation logic. The switches, comprising the multiplexers \( S_1 \) and \( S_2 \), are controlled (antiparallel) by the LSI-11 processor by means of the control/status register. This register is also used to actually start the inner product calculations.

The elements of a vector have to be stored at successive addresses in the RAM bank. The microcode program, consisting of up to 64 microcode instructions, each as shown in Fig. 3, controls the calculation of the inner products. Each microcode word provides the highest address bits of the two vector memory fields involved in the specific inner product, the vector length and whether each of the vectors is stored as a circular buffer or not. Address generation logic constructs the actual vector element addresses needed from these data. This enables, for example, the repetitive use of certain vectors within one calculation run, and the changing of microcode instructions between subsequent calculation runs.

![Fig. 1 Functional block diagram of VIPER-II](image-url)

![Fig. 2 Memory map of each RAM bank](image-url)

Fig. 3 The microcode word format

4 Characteristics

The characteristics of VIPER-II are summarised as follows. VIPER-II is a plug-in quad-slot Q-bus module. It comprises a microprogrammable special-purpose parallel processor for vector inner products. Data and program are stored in two alternately accessible RAM banks. The vectors may consist of 64, 128, 256 or 512 16-bit integer elements. The product results are calculated internally with 35-bit precision and output as 32-bit integers. In one calculation run, the maximum number of products ranges from 61 for vectors of 64 elements to seven for vectors of 512 elements.
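To put the 35-bit internal and 32-bit output word lengths in perspective, the following sketch computes the two's-complement width needed for the worst-case sum of products of signed 16-bit elements. It is plain arithmetic and assumes full-scale inputs throughout, which practical signals rarely reach; the source does not state how scaling or overflow is handled, nor which 32 of the 35 bits are output. The function names are hypothetical.

```python
def signed_bits_needed(value):
    """Smallest two's-complement width that holds a non-negative value."""
    return value.bit_length() + 1


def worst_case_sum(n_elements, word_bits=16):
    # Largest product of two signed words: (-2**(w-1)) * (-2**(w-1)).
    max_product = (1 << (word_bits - 1)) ** 2
    return n_elements * max_product


# Worst-case width for each vector length mentioned in the text.
widths = {n: signed_bits_needed(worst_case_sum(n))
          for n in (64, 128, 256, 512)}
```

Under this worst-case assumption the required width exceeds 35 bits for every supported vector length, so in practice the amplitudes of the signal or the coefficients would have to stay below full scale.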

3 Circuit description

The block diagram of Fig. 4 represents the total apparatus. The multiplier/accumulator is a single large-scale integrated circuit, the ADSP 1010 KD, manufactured by Analog Devices (Analog Devices). This CMOS multiplier/accumulator obtains its data from, and stores its results in, the 4096 × 16-bit RAM to which it is connected via a databus by means of the memory select unit. The microcode specifying the calculations is obtained from the same RAM bank. The current microcode word, describing one inner product, is stored in the microcode register and used as input to the address generator.

Besides the addresses of the vector elements, the address generator also provides the addresses where the 32-bit product results are to be stored. A special stop code in the microcode program indicates when the last inner product has been calculated.

The control/status register specifies the communication between the Q-bus and VIPER-II. It contains a start bit initiating the calculations in the selected memory bank, a data ready bit indicating that the products are all calculated, a memory select bit and an interrupt enable bit.
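The handshake through the control/status register can be sketched as follows. The four bit names come from the text, but the bit positions, the polarity of the memory select bit and the helper functions are purely hypothetical, since the register layout is not given here.

```python
# Hypothetical bit positions; the actual register layout is not specified.
START = 1 << 0
DATA_READY = 1 << 1
MEMORY_SELECT = 1 << 2
INTERRUPT_ENABLE = 1 << 3


def start_run(csr, bank):
    """Return a register value that selects `bank` (0 or 1) and sets start.

    Assumes a set memory select bit means bank 1; other bits are preserved.
    """
    csr &= ~(START | DATA_READY | MEMORY_SELECT)
    if bank == 1:
        csr |= MEMORY_SELECT
    return csr | START


def is_done(csr):
    """True when the data ready bit reports that all products are calculated."""
    return bool(csr & DATA_READY)


csr = start_run(INTERRUPT_ENABLE, bank=1)
```

With interrupts enabled the host would not need to poll `is_done`; it could instead wait for the interrupt raised when the run completes.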

The interface provides the signals necessary for the operations of VIPER-II as a Q-bus module. It controls the data exchange with both memory banks and the control/status register. The interface also contains the interrupt generation logic.

The clock control provides the signals for the synchronisation and timing of the various hardware components.

5 Applications

VIPER-II has worked satisfactorily as a parallel processor in a PDP-11/23 computer system for over a year. It has been found to be very useful in real-time signal processing and real-time control applications, especially when impulse responses or templates have to be adaptive. Its applicability, however, is not restricted to these functions. Other possible applications are: digital filtering, recursive or non-recursive, for example linear phase filters, adaptive filters and matched filters; continuous frequency analysis of sampled signals; calculation of signal power; cross- and autocorrelation functions; statistical manipulations; matrix operations; etc.

References

Analog Devices. ADSP 1010, 16 × 16 bit CMOS multiplier/accumulator. Datasheet from Analog Devices, Norwood, Massachusetts, USA.


1 Introduction

Digital signal processing techniques have found widespread application in the biomedical field. In particular, digital filtering has been commonly used in many areas of biomedicine such as cardiology (PAN and TOMPKINS, 1985), neurology (FRIDMAN et al., 1982), otology (ENGELKEN et al., 1982) and neurophysiology (WHEELER and VALESANO, 1985). Digital filters have been implemented either in hardware (SCHLUTER, 1981) or as software routines for a general-purpose computer (CERUTTI et al., 1985). Hardware implementations are usually intended for real-time operation.

In biomedical applications special emphasis is given to finite duration impulse response (FIR) filters. They are preferred over infinite impulse response (IIR) filters because they can be designed to have a linear phase response. This property is especially important in the filtering of the pulse-like signals which are commonplace in biomedicine. However, many linear phase FIR filters reported in the literature are not true linear phase filters because their phase responses exhibit jumps between linear segments. When phase discontinuities occur in FIR filters with a small number of coefficients, distortions in the output waveform may occur. These short duration FIR filters are being used in real-time biomedical signal processing systems such as cardiac arrhythmia monitors (SCHLUTER, 1981; PAN and TOMPKINS, 1985).

This note shows how and when linear phase FIR filters may introduce phase distortions. Two popular FIR filter design methods are extended to cover true linear phase filters. Some examples compare the characteristics and the behaviour of linear phase and true linear phase filters.

2 Linear phase and true linear phase FIR filters

There are two kinds of linear phase FIR filters: those with an even-symmetric impulse response and those with an odd-symmetric impulse response. The first kind is the most important in biomedical applications and is well suited for the design of low-pass, high-pass, bandpass and bandstop filters. The second kind is characterised by a \( \pi/2 \) term in the phase response and is therefore more suited to the design of differentiators and Hilbert transformers.
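For a causal even-symmetric filter, the phase behaviour can be examined numerically by removing the linear term $e^{-j\omega(N-1)/2}$ from the frequency response; what remains is a real amplitude function, and a sign change in it corresponds to a phase jump of $\pi$. The sketch below is an illustration only, with hypothetical function names and two short example filters chosen for this purpose.

```python
import cmath
import math


def freq_response(h, omega):
    """Frequency response of a causal FIR filter with coefficients h."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))


def zero_phase_amplitude(h, omega):
    """Real amplitude left after removing the linear-phase term.

    Valid for even-symmetric h; the imaginary part is then only rounding error.
    """
    delay = (len(h) - 1) / 2
    return (freq_response(h, omega) * cmath.exp(1j * omega * delay)).real


grid = [math.pi * k / 16 for k in range(17)]          # 0 .. pi
amp_a = [zero_phase_amplitude([1, 1, 1], w) for w in grid]
amp_b = [zero_phase_amplitude([1, 2, 1], w) for w in grid]
has_jump_a = min(amp_a) < -1e-9    # amplitude changes sign: phase jump
has_jump_b = min(amp_b) < -1e-9    # amplitude stays non-negative
```

The three-point moving average `[1, 1, 1]` has amplitude $1 + 2\cos\omega$, which turns negative above $\omega = 2\pi/3$, whereas `[1, 2, 1]` has amplitude $2 + 2\cos\omega$, which never does.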

The unit sample response or impulse response of a FIR filter will be denoted \( h(L), h(L + 1), \ldots, h(L + N - 1) \), where \( L \) and \( N \) are integers and \( h(\cdot) \) is real. \( N \) is the duration of the impulse response. Two cases will be discussed: \( L = 0 \) for a causal system and \( L = -(N - 1)/2 \) (\( N \) odd) or \( L = -N/2 \) (\( N \) even) for a noncausal system.