The NS32000, sometimes known as the 32k, is a series of microprocessors produced by National Semiconductor. The first member of the family, the 32016, came to market in 1982, making it the first 32-bit general-purpose microprocessor on the market. However, the 32016 contained a large number of bugs and often could not be run at its rated speed. These problems, and the presence of the similar Motorola 68000, led to almost no use in the market. Several improved versions followed, including 1985's 32032, which was essentially a bug-fixed 32016 with an external 32-bit data bus, made possible by the improved chip carriers that were becoming common in the market. However, it offered only about 50% better speed than the 32016, and was outperformed by the 32-bit Motorola 68020, released a year prior. The 32532, released in 1987, outperformed the competing Motorola 68030 by a factor of almost two, but by this time most interest in microprocessors had turned to RISC platforms, and this otherwise excellent design also saw almost no use. National was working on further improvements in the 32732, but eventually gave up attempting to compete in the central processing unit (CPU) space. Instead, the basic 32000 architecture was combined with several support systems and relaunched as the Swordfish microcontroller. This had some success in the market before it was replaced by the CompactRISC architecture in the mid-1990s.
The NS32000 series traces its history to an effort by National Semiconductor to produce a single-chip implementation of the VAX-11 architecture. The VAX is well known for its highly "orthogonal" instruction set architecture (ISA), in which any instruction can be applied to any data. For instance, an ADD instruction might add the contents of two processor registers, a register and a value in memory, or two values in memory, or it might use a register as an offset against an address. This flexibility was considered the paragon of design in the era of complex instruction set computers (CISC).
National took DEC to court in California to ensure the legality of the design, but when DEC had the lawsuit moved to Massachusetts, DEC's home state, the lawsuit was dropped and the Series 32000 architecture was developed instead. Although the new instruction set architecture was not VAX-11 compatible, it did retain its highly "orthogonal" design philosophy.
The processors have 8 general-purpose 32-bit registers, plus a series of special-purpose registers:
(Additional system registers not listed).
The instruction set is very much in the CISC model, with 2-operand instructions, memory-to-memory operations, flexible addressing modes, and variable-length byte-aligned instruction encoding. Addressing modes can involve up to two displacements and two memory indirections per operand as well as scaled indexing, making the longest conceivable instruction 23 bytes. The actual number of instructions is much lower than that of contemporary RISC processors.
Unlike some other processors, autoincrement of the base register is not provided; the only exception is a "top of stack" addressing mode that pops sources and pushes destinations. Uniquely, the size of a displacement is encoded in its most significant bits: the prefixes 0, 10 and 11 precede 7-, 14- and 30-bit signed displacements respectively. (Although the processors are otherwise consistently little-endian, displacements in the instruction stream are stored in big-endian order.)
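The displacement encoding described above can be illustrated with a minimal Python sketch. The function name `decode_displacement` is hypothetical, and the byte lengths of 1, 2 and 4 are an assumption inferred from the prefix and payload widths (1+7, 2+14 and 2+30 bits); this is not taken from National's documentation.

```python
def decode_displacement(stream, offset=0):
    """Decode one displacement from an instruction stream.

    Returns (signed_value, length_in_bytes). Hypothetical helper; the
    1/2/4-byte lengths are inferred from the prefix and payload widths.
    """
    first = stream[offset]
    if (first & 0x80) == 0x00:      # prefix 0: 7-bit signed payload
        length, bits = 1, 7
    elif (first & 0xC0) == 0x80:    # prefix 10: 14-bit signed payload
        length, bits = 2, 14
    else:                           # prefix 11: 30-bit signed payload
        length, bits = 4, 30
    if offset + length > len(stream):
        raise ValueError("truncated displacement")
    # Displacements are stored most significant byte first (big-endian).
    raw = int.from_bytes(stream[offset:offset + length], "big")
    raw &= (1 << bits) - 1          # strip the length prefix bits
    if raw & (1 << (bits - 1)):     # sign-extend the payload
        raw -= 1 << bits
    return raw, length
```

For example, `decode_displacement(bytes([0x80, 0x64]))` reads a two-byte displacement whose 14-bit payload is 100.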
General-purpose operands are specified using a 5-bit field. To this can be added an index byte (specifying the index register and a 5-bit base addressing mode), and up to two variable-length displacements per operand.
The first chip in the series was originally referred to as the 16032, but later renamed 32016 to emphasize its 32-bit internals. This contrasts it with its primary competitor in this space, 1979's Motorola 68000 (68k). The 68k used 32-bit instructions and registers, but its arithmetic logic unit (ALU), which controls much of the overall processing task, was only 16-bit. This meant it had to cycle 32-bit data through the ALU twice to complete an operation. In contrast, the NS32000 has a 32-bit ALU, so that 16-bit and 32-bit instructions take the same time to complete.
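The cost of a narrower ALU can be pictured with a small Python model. This is only an illustrative sketch of carry propagation across two 16-bit passes, not a description of the 68000's actual microcode; the names `alu16_add` and `add32_via_16bit_alu` are hypothetical.

```python
MASK16 = 0xFFFF


def alu16_add(a, b, carry_in=0):
    """Model one pass through a 16-bit adder: returns (sum16, carry_out)."""
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16


def add32_via_16bit_alu(x, y):
    """Add two 32-bit values using two passes of the 16-bit adder.

    Returns (result32, carry_out, passes).
    """
    lo, carry = alu16_add(x, y)                     # pass 1: low halves
    hi, carry = alu16_add(x >> 16, y >> 16, carry)  # pass 2: high halves plus carry
    return (hi << 16) | lo, carry, 2
```

In this model a 32-bit add always needs two passes, whereas a 32-bit ALU of the kind described for the NS32000 would handle the same operands in a single pass.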
The 32016 first shipped in 1982 in a 48-pin DIP package. It may have been the first 32-bit chip to reach mass production and sale (at least according to National's marketing). Although this post-dates the 68k by about two years, the 68k was not yet being widely used in the market and the 32016 generated significant interest. Unfortunately, the early versions were filled with bugs and could rarely be run at their rated speed. By 1984, after two years, the errata list still contained items specifying uncontrollable conditions that would result in the processor coming to a halt, forcing a reset.
National changed its design methodology to make it possible to get the part into production, and a design system based on the language "Z" was co-developed with the University of Tel-Aviv, close to the "NSC" design centre in Herzliya, Israel. The "Z" language is similar to today's Verilog and VHDL, but has a Pascal-like syntax and is optimized for two-phase clock designs. However, by the time the fruits of these efforts were being felt in the design, numerous 68k machines were already on the market, notably the Apple Macintosh, and the 32016 never saw widespread use.
The 32016 has a 16-bit external data bus, a 24-bit external address bus, and a full 32-bit instruction set. It also includes a coprocessor interface, allowing coprocessors such as FPUs and MMUs to be attached as peers to the main processor. The MMU implements demand-paged virtual memory, its most unusual feature at a time when the competition used segmented memory; demand paging has since become the standard approach in microprocessor design. The architecture supports an instruction restart mechanism on a page fault, which is much cleaner than the Motorola approach of dumping the internal state on a page fault and reading it back before the instruction is continued.
Although the instruction set was often compared to the 68k's, NSC employees rejected the comparison; one of the key marketing phrases of the time was "Elegance is Everything", contrasting the highly orthogonal Series 32000 with the 68k "kludge". One key difference is that Motorola splits its registers into address registers and data registers, with instructions working only on one kind or the other, whereas the Series 32000 has general-purpose registers.
The 32032 was introduced in 1984. It is almost completely compatible with the 32016, but features a 32-bit data bus (while keeping the 24-bit address bus) for somewhat faster performance. There was also a 32008, a 32016 with the data bus cut down to 8 bits wide for low-cost applications. It is philosophically similar to the MC68008, and equally unpopular.
National also produced a series of related support chips, such as the NS32081 Floating Point Unit (FPU), the NS32082 Memory Management Unit (MMU), the NS32203 Direct Memory Access (DMA) controller and the NS32202 Interrupt Controller. With the full set plus memory chips and peripherals, it was feasible to build a 32-bit computer system capable of supporting modern multi-tasking operating systems, something that had previously been possible only on expensive minicomputers and mainframes.
NS16032 CPU. https://handwiki.org/wiki/index.php?curid=1767279
NS16081 FPU. https://handwiki.org/wiki/index.php?curid=1937204
NS32032 CPU. https://handwiki.org/wiki/index.php?curid=1881977
NS32081 FPU. https://handwiki.org/wiki/index.php?curid=1451299
NS32082 MMU. https://handwiki.org/wiki/index.php?curid=1094893
NS32202 Interrupt controller. https://handwiki.org/wiki/index.php?curid=1537702
NS32203 DMA controller. https://handwiki.org/wiki/index.php?curid=1125413
In 1985, National Semi introduced the NS32332, a much-improved version of the 32032. From the datasheet, the enhancements include "the addition of new dedicated addressing hardware (consisting of a high speed ALU, a barrel shifter and an address register), a very efficient increased (20 bytes) instruction prefetch queue, a new system/memory bus interface/protocol, increased efficiency slave processor protocol and finally enhancements of microcode." There was also a new NS32382 MMU, an NS32381 FPU and the (very rare) NS32310 interface to a Weitek FPA. Taken together, these enhancements made the NS32332 only about 50 percent faster than the original NS32032, which left it slower than its main competitor, the MC68020.
National Semi introduced the NS32532 in early 1987. Running at 20, 25 and 30 MHz, it was a complete redesign of the internal implementation with a five-stage pipeline, an integrated cache/MMU and improved memory performance, making it about twice as fast as the competing MC68030 and i80386. At this stage RISC architectures were starting to make inroads, and the main competitors became the now equally defunct AM29000 and MC88000, which was considered faster than the NS32532. For floating-point, the NS32532 used the existing NS32381 or the NS32580 interface to a Weitek FPA. The NS32532 was the basis of the PC532, one of the few fully realized "public domain" hardware projects (that is, one resulting in an actual, useful machine running a real operating system, in this case Minix or NetBSD).
The semi-mythical NS32732 (sometimes called NS32764) was originally envisioned as the high-performance successor to the NS32532. It never came to market.
A derivative of the NS32732 called Swordfish was aimed at embedded systems and arrived in about 1990. Swordfish has an integrated floating point unit, timers, DMA controllers and other peripherals not normally available in microprocessors. It has a 64-bit data bus and is internally overclocked from 25 to 50 MHz. The chief architect of the Swordfish is Donald Alpert, who went on to manage the architectural team designing the Pentium. The Pentium internal microarchitecture is similar to the preceding Swordfish.
The focus of Swordfish was high-end PostScript laser printers, and performance was exceptional at the time. Competing solutions could render about one new page per minute, but the Swordfish demo unit would print out sixteen pages per minute, limited only by the laser-engine mechanics. On each page it would print out how much time it had spent idling, waiting for the engine to complete.
The Swordfish die is huge, and it was eventually decided to drop the project altogether; the product never went into production. The lessons from the Swordfish were used for the CompactRISC designs. In the beginning, there were both a CompactRISC-32 and a CompactRISC-16, designed using "Z". National never brought a chip to the market with the CompactRISC-32 core. National's Research department worked with the University of Michigan to develop the first synthesizable Verilog model, and Verilog was used from the CR16C onwards.
Versions of the older NS32000 line for low-cost products, such as the NS32CG16, NS32CG160, NS32FV16, NS32FX161, NS32FX164 and the NS32AM160/1/3, all based on the NS32CG16, were introduced from 1987 onwards. These processors had some success in the laser printer and fax markets, despite intense competition from AMD and Intel RISC chips. The NS32CG16 is especially notable. The key differences between it and the NS32C016 are the integration of the expensive TCU (Timing Control Unit), which generates the required two-phase clock from a crystal, and the removal of floating-point coprocessor support, which freed up microcode space for the BitBLT instruction set. These instructions significantly improve performance in laser printer operations, making this 60,000-transistor chip faster than the 200,000-transistor MC68020. The NS32CG160 is the CG16 with timers and DMA peripherals, while the NS32FV/FX16x chips add DSP functionality to the CG16 BitBLT core for the fax and answering machine market. They were later complemented by the NS32532-based NS32GX32. Unlike the previous chips, it had no extra hardware: the NS32GX32 is the NS32532 without the MMU, sold at an attractive price for embedded systems. In the beginning, this was just a re-marked chip. It is unclear whether the chip was later redesigned for lower-cost production.
Datasheets exist for an NS32132, apparently designed for multiprocessor systems. This is the NS32032 extended with an arbiter. The bus usage of the NS32032 is about 50 percent, owing to its very compact instruction set, or to its very slow pipeline, as competitors would phrase it. The NS32132 allows a pair of CPUs to be connected to the same memory system without much change to the PCB. Prototype systems were built by Diab Data AB in Sweden, but did not perform as well as the single-CPU MC68020 system designed by the same company.
NS32C016 CPU. https://handwiki.org/wiki/index.php?curid=1605341
NS32381 FPU. https://handwiki.org/wiki/index.php?curid=1785886
NS32382 MMU. https://handwiki.org/wiki/index.php?curid=1478672
NS32532 CPU. https://handwiki.org/wiki/index.php?curid=1715612
In June 2015, Udo Möller released a complete Verilog implementation of an NS32000 processor on OpenCores. Fully software-compatible with an NS32532 CPU with an NS32381 FPU, it is significantly faster when implemented on an FPGA, both operating at a higher clock rate and using fewer cycles per instruction.