How the Yamaha FM Synth Cassette Interface Works

A technical overview of how the cassette interface used in Yamaha's FM synthesisers encodes and decodes patch data for external storage.

Before the personal computer became the centrepiece of home music production, backing up and storing synthesiser patches was a non-trivial problem. Synth manufacturers experimented with numerous interesting methods over the years, but none of these has aged more poorly than digital cassette storage.

The use of magnetic tape for recording digital data certainly didn't originate with synthesisers, and Yamaha was by no means the first synth manufacturer to use it for external patch storage either. It's unclear where exactly this innovation originated, but by the late 1970s the use of compact cassette tapes for data storage had become all but standard in the home computing world[1]. It didn't take long for the leading synth manufacturers to catch on, with many of the flagship synths of the 80s featuring cassette interfaces, such as the Oberheim OB-X, Sequential Circuits Prophet 5, and the Roland Jupiter 8.

This article discusses the particular technical implementation of the data cassette interface used in Yamaha's FM synthesisers, and the historical context within which it was developed.


By the mid 1980s, patch storage for Yamaha's FM synths had become serious business. In the wake of the DX7, a new industry had emerged specialising in the creation and distribution of patches for Yamaha's FM synths. The glory days may be over for the bands featured in voicecrystal.com's artist gallery, but their testimonials speak to the enormous impact talented patch programmers had on the entire music industry. Professional DX7 patch authors would apparently go so far as to embed garbage information in their patch data to identify patch thieves after the fact, such as unique key scaling in unused operators, or invalid ASCII characters in a patch name[2] (Cox & Warner, 2017).

The years leading up to the DX7's groundbreaking 1983 release were a period of rapid, exciting innovation. The personal computer was on the verge of going mainstream, and technology was rapidly reshaping the world. The equally groundbreaking Apple IIe had been released only a few months earlier, the now ubiquitous 3.5"[3] floppy disk was just beginning to appear in consumer devices, and the Ensoniq Mirage, which would bring affordable disk-based sampling to the masses, was only a year away[4].

Despite recent advances, disk drives still remained complicated, expensive devices. Given all the extra analog circuitry required to integrate one, it's not surprising that synth manufacturers opted for cheaper alternatives[5]. Yamaha had already bet the company on their new LSI voice chip manufacturing, and there likely wasn't any additional elbow room for taking more risks on new technology.

The DX7 featured a cartridge interface for the external storage of patch data. The cartridge itself contained 16kB of EEPROM, which when inserted would be wired directly into the CPU's address space. Released hot on the heels of its more popular sibling, the DX9 would be the first of Yamaha's FM synthesisers to feature a cassette interface[6]. This interface consisted of three 1/8" cable sockets to connect to the headphone, microphone, and remote sockets of a consumer-grade cassette recorder. Later synths in Yamaha's FM line, such as 1987's TX81Z, would use an 8-pin DIN socket for the cassette interface, coupled with a proprietary cable terminating in three 1/8" jacks. This cable format is shared with the MSX computer cassette interface, which is not surprising given Yamaha's involvement in the development of the MSX architecture.

Encoding Format

The encoding scheme used by the Yamaha FM synthesisers' cassette interface is a variation of the 'Kansas City standard' format, known as CUTS. This format uses 'frequency shift keying' to encode digital data. The CUTS format is shared with MSX computers, and supports a much faster 1200 baud[7] in addition to the 300 baud of the original Kansas City standard.

The audio signal begins with an extended 2400Hz 'pilot tone', which can be used to calibrate the volume of the input source. What follows is a sequence of data frames, each containing a single byte. Multiple data items (such as multiple patches, performances, etc) can be encoded in a single recording, separated by arbitrary lengths of the pilot tone.

A binary '1' bit (known as a 'mark') is encoded with two 'cycles' at 2400Hz. A '0' (known as a 'space') is encoded as a single cycle at 1200Hz. A data packet, occupying roughly 9.2 milliseconds of tape at 1200 baud, consists of a leading zero bit indicating the start of a data frame, followed by the 8 data bits, LSB first. Two trailing 'one' bits indicate the end of the data frame.

The following diagram demonstrates the structure of an encoded data packet, the byte 0b10110101:

Diagram of a byte encoded in KCS format
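
The frame layout described above can be sketched in a few lines of Python. This is only an illustration of the framing, not code taken from any Yamaha firmware; the names `frame_bits` and `BAUD` are hypothetical.

```python
BAUD = 1200  # bits per second, per the DX100 service manual


def frame_bits(byte):
    """Return the 11 bits of one frame: a start bit (0),
    the 8 data bits LSB first, then two stop bits (1, 1)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0-255")
    data = [(byte >> i) & 1 for i in range(8)]  # LSB first
    return [0] + data + [1, 1]


bits = frame_bits(0b10110101)
print(bits)  # [0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1]
print(f"{len(bits) / BAUD * 1000:.2f} ms per frame")  # 9.17 ms per frame
```

Each frame carries 11 bits for every 8 bits of payload, which is where the frame duration of a little over 9 milliseconds at 1200 baud comes from.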


The technical implementation of the cassette interface in Yamaha's FM synthesisers is surprisingly simple: the raw electrical audio signal played from the cassette can be interpreted as binary data by the CPU's I/O ports, without requiring any analog-to-digital conversion. Sampling the positive peak of the audio wave period from the I/O port will be perceived by the CPU as a binary one, with a sample of negative voltage perceived as a zero.

Similarly, pulling an output I/O port's signal high and low in quick succession will be recorded as a sinusoidal oscillation when sampled as audio by a tape recorder. By controlling the frequency of toggling the output voltage in the software, it is possible to create the encoded data output directly from the CPU, without requiring any additional, specialised circuitry.
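
As a rough illustration of the output side, the Python sketch below models the output pin as a list of high (1) and low (0) levels. It assumes a hypothetical timing grid of eight equal ticks per bit period; the real firmware times its toggles with delay loops, and the function name `bit_to_levels` is invented for this example.

```python
def bit_to_levels(bit, ticks_per_bit=8):
    """Return the pin levels for one bit period.

    A '1' is two full cycles (2400Hz), a '0' is one full cycle (1200Hz).
    ticks_per_bit is an assumed timing grid and must be a multiple of 4.
    """
    if ticks_per_bit % 4 != 0:
        raise ValueError("ticks_per_bit must be a multiple of 4")
    cycles = 2 if bit else 1
    half_cycle = ticks_per_bit // (2 * cycles)
    levels = []
    for _ in range(cycles):
        # One cycle: hold the pin high, then low, for half a cycle each.
        levels += [1] * half_cycle + [0] * half_cycle
    return levels


print(bit_to_levels(1))  # [1, 1, 0, 0, 1, 1, 0, 0]
print(bit_to_levels(0))  # [1, 1, 1, 1, 0, 0, 0, 0]
```

Both bit values occupy the same amount of time; only the number of toggles within the period differs.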

By periodically sampling the I/O pin connected to the cassette interface's input port, it's possible to determine the frequency of the incoming audio by counting the number of times the polarity of the signal changes within an arbitrary period.

To accomplish this, the synth firmware samples the cassette input I/O port at a set interval corresponding to the baud rate. The firmware controls this interval using an arbitrary delay routine, called between each sampling of the input port. Each successive sample is tested against the last using an XOR instruction, incrementing the 'polarity change count' if the values differ. The number of polarity changes within a period will indicate whether the period's frequency was 2400Hz, indicating a '1' bit, or 1200Hz for a '0'. From this information it is possible to construct the full byte.
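
The counting step can be sketched in Python as follows. This is a simplified model rather than a transcription of the firmware: it assumes the samples for one bit period are already collected in a list, that the window is aligned to the start of the bit, and that eight samples are taken per bit. The function names are hypothetical.

```python
def count_polarity_changes(samples):
    """Count how often consecutive samples differ, using XOR
    in the same way the firmware compares each sample to the last."""
    changes = 0
    last = samples[0]
    for level in samples[1:]:
        changes += level ^ last  # 1 if the polarity changed, else 0
        last = level
    return changes


def classify_bit(samples):
    """Two or more changes in the window means 2400Hz, i.e. a '1'."""
    return 1 if count_polarity_changes(samples) >= 2 else 0


one_bit = [1, 1, 0, 0, 1, 1, 0, 0]   # two cycles in the window (2400Hz)
zero_bit = [1, 1, 1, 1, 0, 0, 0, 0]  # one cycle in the window (1200Hz)
print(count_polarity_changes(one_bit), classify_bit(one_bit))    # 3 1
print(count_polarity_changes(zero_bit), classify_bit(zero_bit))  # 1 0
```

Under these assumptions a '1' produces three changes inside the window and a '0' produces one, so a threshold of two separates them cleanly.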

The Yamaha FM synths use an interesting trick to construct the final value using the Hitachi HD6303 architecture's rotate instructions. The number of polarity changes counted in a period will either be fewer than two in the case of a '0' bit, or two or more in the case of a '1' (binary 0b10 or above). If the resulting count is rotated rightwards twice, bit 1 of the count ends up in the processor's carry flag, so the carry will be set when the input bit is a '1'. With the result of the last bit read stored in the carry flag, the result byte is then rotated rightwards, moving the carry into its most-significant bit. Since each byte is encoded LSB first, after 8 iterations of this routine the final result byte will have been decoded. This same method of constructing the final byte is used in the cassette interface code in the DX9, DX100, and TX81Z firmware, with only minor variations.
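
The rotate trick can be emulated in Python to show why it works. The sketch below models an 8-bit rotate right through carry, as found on 6800-family processors, and assumes polarity change counts of 3 for a '1' and 1 for a '0'. The helper names `ror` and `decode_byte` are hypothetical, and the real firmware routines differ in detail.

```python
def ror(value, carry):
    """8-bit rotate right through carry.

    Bit 0 moves into the carry; the old carry moves into bit 7.
    Returns (new_value, new_carry).
    """
    new_carry = value & 1
    new_value = ((value >> 1) | (carry << 7)) & 0xFF
    return new_value, new_carry


def decode_byte(change_counts):
    """Build a byte from eight per-bit polarity change counts, LSB first."""
    result = 0
    for count in change_counts:
        carry = 0
        count, carry = ror(count, carry)
        count, carry = ror(count, carry)  # carry now holds bit 1 of the count
        result, _ = ror(result, carry)    # shift the decoded bit in at bit 7
    return result


# Counts for the byte 0b10110101, LSB first: 3 for a '1', 1 for a '0'.
counts = [3, 1, 3, 1, 3, 3, 1, 3]
print(bin(decode_byte(counts)))  # 0b10110101
```

Because each new bit enters at the top of the result byte and earlier bits shift down, the first bit received finishes in bit 0 after eight iterations, matching the LSB-first ordering on tape.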


  1. The very first IBM Personal Computer, released in 1981, would feature one.
  2. Having disassembled the firmware and annotated all of the string subroutines, I can confirm that you could safely do this. Invalid ASCII characters in a patch name simply won't be printed. If an ASCII character value outside of the valid 0-0x7F range is encountered, the string copy process will stop harmlessly.
  3. According to legend, its predecessor, the 5.25" floppy disk, was designed at this particular size to discourage users from transporting them in their pockets and subsequently damaging them by bending.
  4. Ensoniq certainly deserves some credit for putting a 3.5" disk drive into their groundbreaking sampler: The Mirage was one of the first mass-produced consumer devices to incorporate this new standard for floppy disks. In fact, the first manufactured Mirage models apparently featured one of the first production 3.5" disk drives: The Shugart SA300.
  5. History would ultimately vindicate this decision. A roaring trade of replacement disk drives for vintage synths exists on eBay, Reverb, and other trading sites. The various encoding schemes used for these disks have also not stood the test of time. While proprietary cartridges may have contributed their own particular problems, the technology remains as reliable as it ever was.
  6. It's hard to say why Yamaha opted for a cassette interface over their proprietary voice cartridge system. Evidently Yamaha didn't consider either medium (data cassette versus cartridge) technically superior to the other: Both were used in their subsequent DX/TX series synthesisers. The DX7II and TX802, released in 1986 and 1987 respectively, would both feature a cartridge interface, whereas every other model would have a cassette interface.
  7. The DX100 service manual helpfully confirms that the baud rate is indeed 1200. Yamaha's service manuals never disappoint: They remain a font of oddly specific, though generally vague, technical trivia.