Sphinx and the Cursed Mummy Wiki

A binary sound effect (*.sfx) container is actually one of three or four different file types that share the same extension and are heavily platform-dependent. They are used for level soundbanks, streamed sounds and music tracks.

Generally the audio data itself is ADPCM-encoded, so that it can be decoded by the embedded hardware.

  • IMA ADPCM with stereo channels interleaved: GameCube, PC (always software-decoded).
  • Sony VAG: PlayStation 2.
  • Xbox ADPCM: Xbox.
  • GameCube DSP ADPCM: GameCube (used in the final release; hardware playback, for speed).

All of the following files were originally configured and exported with a custom GUI program called EuroSound. Unlike EuroLand, it is no longer available, and the program would not be useful on its own because no source files were preserved.

An open-source replacement is currently being developed to mux and demux soundbanks into YAML and plain *.wav files.

Soundbanks

Every level stores all of the sound effects it uses in its own soundbank. Each sound effect has a series of flags and properties, and contains a variable-length array of raw sound samples (PCM on PC, sometimes ADPCM-encoded on some sixth-generation consoles).

Soundbank files

These are defined via hashcode in one of the last columns of X:\Sphinx\Grafix\Spreadsheets\LevelData.xls. Then, at runtime, when a level loads its corresponding .EDB files, it also looks for that hashcode in X:\Sphinx\Binary\_bin_PC\_Eng\HCXXXXXX.sfx. The filename is unprefixed and formatted in hexadecimal, with the hashcode type section masked out.

There are two exceptions that are hardcoded to try to load at game startup:

  • HC00FFFF.SFX (the single, special stream file; which has a different format).
  • HCFFFFFF.SFX (a special soundbank; the base/common/fallback soundbank?).
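The naming rule above can be sketched in a few lines. This is only an illustration: the `sfx_filename` helper and the `0x00FFFFFF` mask are assumptions inferred from the six hex digits visible in the filenames on this page, not something taken from the engine.

```python
# Assumption: "masking out the hashcode type section" means dropping the
# top byte, which leaves the six hex digits seen in the filenames.
HASHCODE_INDEX_MASK = 0x00FFFFFF


def sfx_filename(hashcode: int) -> str:
    """Build the HCXXXXXX.SFX name for a soundbank, stream or music hashcode."""
    return f"HC{hashcode & HASHCODE_INDEX_MASK:06X}.SFX"


print(sfx_filename(0x1BE0000C))  # HCE0000C.SFX, a music track
print(sfx_filename(0x0000FFFF))  # HC00FFFF.SFX, the stream file
print(sfx_filename(0x00FFFFFF))  # HCFFFFFF.SFX, the special soundbank
```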

Stream file (HC00FFFF.SFX)

Most of the long, streamed ambient sounds are stored here once instead of being duplicated in each soundbank. Each soundbank references an index in the list of streamed sounds. The engine knows a sound is streamed because, unlike most other internal indices, streamed ones are negative and seemingly stored in two's complement.

It has a format very similar to a streamed music file, and there is only one file that is permanently kept loaded in memory. It's usually located in X:\Sphinx\Binary\_bin_PC\_Eng\HC00FFFF.SFX, but retrieved as ./_bin_PC/_Eng/HC00FFFF.SFX in the remastered port.

Sound details descriptor (SFX_Data.bin)

Contains a special binary array. Most of that data is redundant, but it probably exists to avoid having to load each soundbank just to get properties like the length of a sound. This table is generally kept resident in memory during the entire session. The corresponding data is available as a C-style header in the Sonix folder of the Authoring Tools. See SFX_Data.h.

Confusingly, it's stored in X:\Sphinx\Binary\_bin_PC\music\SFX_Data.bin even though it doesn't have anything to do with music. The EngineX programmers at Eurocom probably chose this because sounds are supposed to vary per language (and hence are loaded from a different folder), so they needed a static but common path to put this in. In practice, only the English folder is actually in use: there aren't voice-overs in the final version, even though multi-language support was planned almost from the start.

Music track files

These streamed files are loaded on demand from X:\Sphinx\Binary\_bin_PC\music\HCE?????.SFX. Each music track is defined via hashcode and stored in its own file. A music file usually contains a list of (also hashcode-defined) jump points and start points. Almost every track has a lead-in time, a middle looping section, and —occasionally— a small ending that is not generally used. These files always store ADPCM-encoded sound.

File format data structures

Until this point there has been a rough overview of how the whole audio system comes together. From this point onward we will describe and document in detail the inner workings of all these sound-related EngineX files and their internal layouts, so that other third-party tools can interoperate with them, creating their own sounds and music tracks for mods.

For any questions please visit the #mod-dev channel in the official Sphinx Discord server, or ping/mention @Swyter.

Soundbank file structure

Soundbank header
Offset Size Description
0h 4 Contains a four-byte string with the magic value MUSX. 4D 55 53 58 in hex.
4h 4 Hashcode for the current soundbank without the section prefix, should mirror the hexadecimal filename.
8h 4 Constant offset to the next section, probably unused. Always 0xC9 for all soundbanks.
Ch 4 Size of the whole file, in bytes. Unused.
10h 4 SFX start; an offset that points to the section where the SFX entries are stored, always 0x800 in the original software.
14h 4 SFX length; size of the first section, in bytes. Depends on how many elements are present, so it varies.
18h 4 Sample info start; offset to the second section where the sample properties are stored. Usually goes right after the previous section.
1Ch 4 Sample info length; size of the second section, in bytes.
20h 4 Special sample info start; unused and uses the same sample data offset as dummy for some reason.
24h 4 Special sample info length; unused and set to zero.
28h 4 Sample data start. Offset that points to the beginning of the PCM data, where sound is actually stored.
2Ch 4 Sample data length. Size of the block, in bytes.
SFX elements, relative to the SFX start block (i.e. 0x800)
0h 4 SFX entry count in this soundbank.
4h 8 Linear array of sorted SFX headers laid out in this format, for fast binary search at runtime:
SFX header
0h 4 SFX hashcode. Writes the hashcode without the section. e.g. 0x1A000001 will be 0x1.
4h 4 Offset to said SFX entry.

In turn, each SFX entry —or parameter block— is made out of the following fields:

SFX parameter entry
0h 2 Ducker length. Duration to apply the duck for, seemingly in hundredths of a second; internally it gets turned into milliseconds. See the Ducker field below to control the target ducking volume.
2h 2 Min delay.
4h 2 Max delay.
6h 2 Inner radius real. The distance from the source within which the sound plays at full volume and won't be attenuated.
8h 2 Outer radius real. The distance from the center of the source beyond which the sound is muted. Between the inner and the outer radius there is a proportional falloff, fading in and out depending on the distance.
Ah 1 Reverb send. From 0 to 100.
Bh 1 Tracking type. Specifies the type of SFX.
SFX tracking types and their numeric constants
0 2D The sound won't be panned in the left-right channels. Inherently stereo.
1 Amb Ambient sound.
2 3D Three-dimensional sound. Inherently mono.
3 3D_Rnd_Pos Same but with a random positional offset.
4 2D_PL2 Dolby Pro Logic 2. A proprietary surround encoding. No idea. Probably unsupported.
Ch 1 Max voices. Probably how many simultaneous instances can be played back during gameplay. From 0 to 10-ish.
Dh 1 Priority. The higher the priority, the higher the probability that the SFX instance won't be killed when there are too many sounds playing at the same time. There's a hardcoded number of maximum playable sounds, and it will start killing the lower-priority ones. From 0 to 100.
Eh 1 Ducker. Volume to duck to, or more like the opposite, I think. From 0 to 100.
Fh 1 Master volume. Shared volume of all the samples contained in this SFX. From 0 to 100.
10h 2 Flags, listed below, in order:
SFX parameter flags
1 << 0 maxReject There's a hard limit on the number of voices playing at the same time. When the limit has been reached and this SFX is being set up, the engine can kill either the oldest playing instance of the same type or the one it was about to play. With this flag set the new instance is rejected, and no more instances of this type are created until there's some headroom and a bunch of previous SFX have been killed/ended.

1 << 1 nextFreeOneToUse Seems unused or unimplemented.
1 << 2 ignoreAge Seems unused or unimplemented.
1 << 3 multiSample If false, it will randomly pick and play one of the samples in the list.
1 << 4 randomPick Seems unused or unimplemented. See alternatives below.
1 << 5 shuffled When multiSample is true and polyphonic is false, and this is enabled, it will interpret the list of samples as a (randomly) shuffled sequence.
1 << 6 loop If the SFX has ended it will either loop and restart it back anew or kill it.
1 << 7 polyphonic If multiSample is true and this is enabled, it plays back all the samples of the list at the same time, give or take; a random delay, if there is one, will affect the start time. Otherwise, if this is disabled, it will interpret the list as a sequence of multiple samples, played in order when shuffled is false.

1 << 8 underWater If set to false, the volume will be halved when the camera goes underwater.
1 << 9 pauseInNis Platform-specific stuff for the original Nintendo console, I think. Seems unused.
1 << 10 hasSubSfx If true, the file reference of the first "sample" entry gets interpreted as a SFX hashcode and played as a 3D sound at the current position.
1 << 11 stealOnLouder Replace similar sounds of the same type when this one is louder. Works as an alternative to maxReject in similar circumstances. Instead of killing the oldest one we kill and combine both volumes to get a similar sound envelope without losing too much sonority.
1 << 12 treatLikeMusic This sound uses the master music volume, think of the nomad or trading outpost jingles. If you disable music you can't hear them, even if they are ambient sounds.
12h 2 Sample count.
14h 12 Linear array of sample pool files used by this sound effect. Originally probably actual separate files, here as an embedded structure:
Sample pool element
0h 2 File reference. As explained above, it works as an index into the sample info table. When the index is negative, it means that the sample is streamed and the data must be looked up in the stream file instead, where it's also indexed, but that file uses its own special format which is a mix between a music and a soundbank file. The former is not documented yet, but scroll down for the stream file.
2h 2 Pitch offset. Signed.
4h 2 Random pitch offset. Signed.
6h 1 Base volume. Goes from 0 to 100.
7h 1 Random volume offset. Signed.
8h 1 Pan.
9h 1 Random pan.
Ah 2 Alignment padding. Empty, garbage.
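As a rough sketch of how a tool might read the flags word and one 12-byte sample pool element, the snippet below uses Python's `struct` module. The names (`SfxFlags`, `PoolElement`, `parse_pool_element`) are made up for illustration, little-endian byte order is assumed (i.e. the PC files), and the signedness of the pan fields is a guess.

```python
import struct
from enum import IntFlag
from typing import NamedTuple


class SfxFlags(IntFlag):
    """Bit positions as listed in the SFX parameter flags table."""
    MAX_REJECT = 1 << 0
    NEXT_FREE_ONE_TO_USE = 1 << 1
    IGNORE_AGE = 1 << 2
    MULTI_SAMPLE = 1 << 3
    RANDOM_PICK = 1 << 4
    SHUFFLED = 1 << 5
    LOOP = 1 << 6
    POLYPHONIC = 1 << 7
    UNDER_WATER = 1 << 8
    PAUSE_IN_NIS = 1 << 9
    HAS_SUB_SFX = 1 << 10
    STEAL_ON_LOUDER = 1 << 11
    TREAT_LIKE_MUSIC = 1 << 12


class PoolElement(NamedTuple):
    file_ref: int
    pitch_offset: int
    random_pitch_offset: int
    base_volume: int
    random_volume_offset: int
    pan: int          # assumption: signed
    random_pan: int   # assumption: signed

    @property
    def is_streamed(self) -> bool:
        # Negative file references point into the stream file.
        return self.file_ref < 0


# Assumption: little-endian, as on PC. The trailing 2x skips the padding.
POOL_ELEMENT = struct.Struct("<hhhBbbb2x")


def parse_pool_element(data: bytes, offset: int = 0) -> PoolElement:
    return PoolElement(*POOL_ELEMENT.unpack_from(data, offset))
```

For example, `SfxFlags(0x48)` contains both `MULTI_SAMPLE` and `LOOP`, and an element whose file reference is -3 reports `is_streamed` as true.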


Sample info elements, relative to their own section
0h 4 Sample info count in this soundbank.
4h 40 Linear array of sample header elements, each element is made out of the following fields:
Sample header data
0h 4 Flags. Either the one below or set to zero, I think.
Sample header flags
1 << 0 looping Indicates if the sample loops or not during playback. Makes the sound subsystem use the Loop offset field at runtime.

4h 4 Address. Relative offset in the sample data section. At runtime this pointer gets modified and made absolute.
8h 4 Size of the raw PCM data in bytes, with any trailing padding/alignment in mind. Used for DMA'ing. Size in memory (aligned to n).
Ch 4 Frequency. 22050 Hz on PC. Used for normal playback; varies per format.
10h 4 Real size, containing the actual sample data, without any extra padding. Always equal or less than the (aligned) Size field.
14h 4 Number of channels. 2D sounds are stereo (2 channels), while 3D/panned sounds are mono (1 channel).

Streamed sounds and normal 3D sounds seem to be mono, streamed music tends to be stereo, I think.

18h 4 Bits per sample. Either 8 or 16, the default seems to be 16-bit.
1Ch 4 PSI sample header. Unused on PC, on GameCube is the relative offset to the matching special sample info block, see below. Because the data of the various sample arrays is usually listed in the same order you can also divide by 0x60 to get the original index, which should match this one. Internally it is called the «GameCube standard header info».
20h 4 Loop offset. In samples. Can be zero if the sample doesn't loop at all. Offset of loop start.
24h 4 Duration. Rounded, in milliseconds, I think.

(Note: On GameCube this field is mistakenly little-endian due to a bug in the initial exporter and needs to be flipped at runtime.)

Sample data, relative to the (guess what) sample data section offset
0h ~ All the raw PCM data goes in this section, sequentially, as a blob, with some alignment guarantees, but nothing hardcoded. The engine accesses each of them by using the sample info address field.
Special sample info block; relative to the special sample info section offset

ADPCM metadata and parameters for the GameCube DSP (hardware) decoder, missing/unneeded in other platforms

0h 96 To get the number of elements in this section, divide the total size of the block by one element worth of bytes (0x60). The engine loads this section separately and accesses each entry through the PSI sample header offset of the sample info structure above. This section doesn't exist on anything other than GameCube.
Sample header data
0h 4 Number of samples. The total number of RAW samples.
4h 4 Number of ADPCM nibbles. This includes the frame headers.

A nibble is 4-bit, or half a byte, which is the size of one ADPCM data element, so two of them fit into a single byte.

8h 4 Sample rate. The frequency of the sample, in Hertz / Hz. Same as in other places.
(DSP addressing and decode context)
Ch 2 Loop flag. More like a boolean, one for looped, and zero for a not looping sample.
Eh 2 Format, always zero for ADPCM, which is the only thing which uses this thing. :-)
10h 4 Start offset address for looped samples, zero for non-looped.
14h 4 End offset address for looped samples.
18h 4 Always zero.
1Ch 32 Decode coefficients, eight pairs of 16-bit words. 16 USHORTs.
(DSP decoder initial state)
3Ch 2 Gain, or volume. Always zero for ADPCM.
3Eh 2 Predictor/scale.
40h 2 Sample history (n-1).
42h 2 Sample history (n-2).
(DSP decoder loop context)
44h 2 Predictor/scale for loop context.
46h 2 Sample history (n-1) for loop context.
48h 2 Sample history (n-2) for loop context.
4Ah 20 10 reserved USHORTs of padding. Zeroed out.
5Eh 2 Implicit struct padding, for alignment.
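The 0x60-byte layout above maps onto a single `struct` format string. The following is a minimal sketch, assuming big-endian data (this section only exists on GameCube) and signed sample history words; `DspInfo` and `parse_special_sample_info` are illustrative names, not engine identifiers.

```python
import struct
from typing import List, NamedTuple

# Assumption: big-endian, since this block is GameCube-only.
# 3 x u32, 2 x u16, 3 x u32, 16 coefficient words, gain, predictor/scale,
# 2 history words, loop predictor/scale, 2 loop history words,
# 10 reserved words and 2 bytes of struct padding.
DSP_INFO = struct.Struct(">3I2H3I16H2H2h1H2h10H2x")


class DspInfo(NamedTuple):
    sample_count: int
    nibble_count: int
    sample_rate: int
    loops: bool
    loop_start: int
    loop_end: int
    coefficients: tuple
    predictor_scale: int
    history: tuple            # assumption: signed 16-bit values
    loop_predictor_scale: int
    loop_history: tuple


def parse_special_sample_info(section: bytes) -> List[DspInfo]:
    """Split the section into 0x60-byte entries; the count is size / 0x60."""
    entries = []
    for i in range(len(section) // DSP_INFO.size):
        f = DSP_INFO.unpack_from(section, i * DSP_INFO.size)
        entries.append(DspInfo(
            sample_count=f[0], nibble_count=f[1], sample_rate=f[2],
            loops=bool(f[3]), loop_start=f[5], loop_end=f[6],
            coefficients=f[8:24], predictor_scale=f[25],
            history=f[26:28], loop_predictor_scale=f[28],
            loop_history=f[29:31],
        ))
    return entries
```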

To sum up: one soundbank stores the various sound effects (SFXs) used by the matching level, and one SFX can use one or more sound samples. The sample data itself can be shared between various SFXs. Each SFX properties descriptor has one or more sample pool elements with custom playback properties, and each of those has an index that points to the sample info array (or to the stream file, if negative). The pool properties are a way of modulating those shared PCM samples through pitch correction, volume changes or other means, without having to store two versions of the same data.
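To make the header layout more concrete, here is a small sketch that reads the twelve header fields and the sorted SFX header table. It assumes little-endian PC files; `SoundbankHeader`, `parse_soundbank_header`, `read_sfx_table` and `find_sfx` are hypothetical names. What the per-entry offset is relative to is not stated above, so the sketch returns it untouched.

```python
import struct
from bisect import bisect_left
from typing import List, NamedTuple, Optional, Tuple


class SoundbankHeader(NamedTuple):
    magic: bytes
    hashcode: int
    constant_offset: int
    file_size: int
    sfx_start: int
    sfx_length: int
    sample_info_start: int
    sample_info_length: int
    special_sample_info_start: int
    special_sample_info_length: int
    sample_data_start: int
    sample_data_length: int


HEADER = struct.Struct("<4s11I")  # assumption: little-endian (PC); 0x30 bytes


def parse_soundbank_header(data: bytes) -> SoundbankHeader:
    header = SoundbankHeader(*HEADER.unpack_from(data, 0))
    if header.magic != b"MUSX":
        raise ValueError("not a MUSX container")
    return header


def read_sfx_table(data: bytes, header: SoundbankHeader) -> List[Tuple[int, int]]:
    """Return the (hashcode, offset) pairs stored right after the entry count."""
    (count,) = struct.unpack_from("<I", data, header.sfx_start)
    return [struct.unpack_from("<II", data, header.sfx_start + 4 + i * 8)
            for i in range(count)]


def find_sfx(table: List[Tuple[int, int]], hashcode: int) -> Optional[int]:
    """Binary search over the sorted table; returns the raw offset or None."""
    keys = [h for h, _ in table]
    i = bisect_left(keys, hashcode)
    return table[i][1] if i < len(keys) and keys[i] == hashcode else None
```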

Stream file structure

Stream file header
Offset Size Description
0h 4 Contains a four-byte string with the magic value MUSX. 4D 55 53 58 in hex.
4h 4 Hashcode for the current soundbank without the section prefix, should mirror the hexadecimal filename. Always 0xFFFF. See StreamFileHashCode in SFX_Defines.h.
8h 4 Constant offset to the next section, probably unused. Always 0xC9 for all soundbanks.
Ch 4 Size of the whole file, in bytes. Unused.
10h 4 File start 1; an offset that points to the stream look-up file details. Set to 0x800 in the original software.
14h 4 File length 1; size of the first section, in bytes.
18h 4 File start 2; offset to the second section with the sample data. Set to 0x1000 in the original software.
1Ch 4 File length 2; size of the second section, in bytes.
20h 4 File start 3; unused offset. Set to zero.
24h 4 File length 3; unused. Set to zero.
Stream look-up table, relative to the file start 1 block (i.e. 0x800)
0h ~ An array of offsets, linear, sorted, no hashcodes here, the index is the negative-turned-positive file ref in other soundbank files' sample pools:
Stream look-up file details
0h 4 Address. Relative offset in the stream file details section.

The number of streams to load is found (manually) by dividing the section size in the header, see file length 1 above, by the size of one stream look-up detail element (4h). No fancy info count here.
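The division described above is short enough to show directly. This is a sketch under the assumption of little-endian PC data; `read_stream_lookup` is an illustrative name, and its two numeric arguments are the File start 1 and File length 1 header fields.

```python
import struct
from typing import List

LOOKUP_ENTRY_SIZE = 4  # one 32-bit offset per stream


def read_stream_lookup(data: bytes, file_start_1: int, file_length_1: int) -> List[int]:
    """Return every stream offset; the count comes from the section size."""
    count = file_length_1 // LOOKUP_ENTRY_SIZE
    # Assumption: little-endian, as on PC.
    return list(struct.unpack_from(f"<{count}I", data, file_start_1))
```

Each returned value is an offset into the stream file details section (File start 2).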

Stream file details, relative to the file start 2 block (i.e. 0x1000)
0h 4 Marker size. These recursive structures that follow are variable, this covers the whole structure for the stream.
4h 4 Audio offset. Where the raw ADPCM data is located. Relative to the current section. Generally 0x1000 bytes after the current structure, they go inline. Unlike other soundbank elements that are sorted and separated by type.
8h 4 Audio size, in bytes.
Ch 20 Stream marker header data (14h bytes).
Stream marker header data
0h 4 Start marker count.
4h 4 Marker count.
8h 4 Start marker offset.
Ch 4 Marker offset.
10h 4 Base volume. From 0 to 100, usually 100.
20h 52 A linear array containing the music marker start data (34h bytes per entry). Each entry is laid out like this:
Stream marker start data
0h 32 Marker, for the full structure see below.
20h 4 Marker position. This is more like the embedded marker's index into the stream marker data table that comes after this, as a way to "link" both identical structures.
24h 4 Is instant. Boolean. Seems unused, set to zero.
28h 4 Instant buffer. Don't ask me, sounds like another boolean or some kind of pointer. Unused or unimplemented at this time.
2Ch 8 State. Two decoded sound samples. In-memory structure used to decompress ADPCM, due to how the differential audio compression works we need to keep the previous samples somewhere to use them as base, or start point, for the next one. Zeroed out by default when starting from the beginning, otherwise the previous sample data needs to be filled out here. There are two of them because there are two independent/interleaved channels when using Stereo.


Why do we need this? Because otherwise we would need to fast-forward from the beginning to reconstruct the correct sample until that point; each sample adds or subtracts over the previous one. There's no simple way of getting the value, especially when we want to jump to random points in the middle of it.

ADPCM runtime state buffer
0h 4 State 0
4h 4 State 1

Each of the 32-bit state buffers contains the following fields in packed form:

ADPCM runtime state buffer
16 bits Predicted value Previously predicted PCM value, it can also be considered the previously-decoded IMA sample. Otherwise starts at zero.
8 bits Input buffer Staging buffer that contains the last retrieved byte from the stream buffer, only used if Buffer step is true.

One (8-bit) byte contains two 4-bit IMA samples. So it only grabs new data half of the time a sample is decoded.

The decoder goes one nibble at a time.

7 bits Index Previous step change index. As decoded from the delta in the index table. See any IMA decode function for context.
1 bit Buffer step One-bit boolean that controls if we are decoding in the middle of a 4-bit-per-sample IMA byte. True is 1, False is 0.
  1. If false, the decoder will grab a new byte from the stream buffer, it will be copied in the Input buffer, and it will decode the leftmost nibble (4 bits) from it, setting the Buffer step to true.
  2. In the next loop iteration the decoder grabs the same byte from the Input buffer, but it will decode the rightmost nibble and it will set Buffer step to False.
  3. In the next loop iteration we start the same thing again, grabbing another byte.

If the decoded buffer chunk ends in the middle of a byte then the state will exit from the decode function with Buffer step set to True.

So this is why these two fields are only important if we begin playback from the middle of a byte. Generally this isn't the case.

It can also be expressed like this for little-endian platforms, not the case of GameCube:
(((valpred     & 0xffff) << 16) |  /* 0xPPPP.... */ /* swy: signed 16-bit predictor */
 ((inputbuffer &   0xff) <<  8) |  /* 0x....II.. */ /* swy: full byte that contains a staging buffer, because each sample uses 4 bits at a time, so we load a byte, use the top nibble first and decode it, then the bottom nibble and we decode it, then we grab another byte and we start again, keep in mind that stereo channels are interleaved */
 ((bufferstep  &    0x1) <<  7) |  /* 0x......OO */ /* swy:     boolean, top bit of the last byte */
 ((index       &   0x7f) <<  0))   /* 0x......OO */ /* swy: remaining seven bits of the last byte */
Or rather, those fields are ordered in reverse once bit struct layouts are taken into account. Use this instead to output raw 32-bit integers; we want 0xOOIIPPPP when printed as hexadecimal:
(((valpred & 0xffff) <<  0) | ((inputbuffer & 0xff) << 16) | (((bufferstep & 0x1) << 7) | ((index & 0x7f) << 0)) << 24)
~ 32 A linear array of stream marker data elements. Notice how each start data structure above contains/embeds their own copy of the stream marker that will come afterwards. In general, every streamed sample contains at least two music markers; a Start and an End one. Each entry is laid out like this:
Stream marker data
0h 4 Name. This is like some sort of identifier. Points to an index/element in the start data table above.
4h 4 Position. More like data pointer. This is the amount of samples in the audio buffer that we need to skip to get there. The first start point is usually zero, i.e. play right from the start.
  • Keep in mind that the decoder actually rounds the position down for 256-byte block alignment (i.e. do (Pos / 256) * 256), so for an 0xe1618 offset it will use 0xe1600. This is also important to select the right State value. See examples below.
  • Keep in mind that internally the game aligns these offsets to 256 and then divides everything by 4, which is the compression factor of a 4-bit ADPCM sample with respect to the final 16-bit signed (short) decoded PCM sample.

So every half byte (a nibble) gets decoded or expanded into a signed number that fits into two bytes. So there is a 1:4 data compression ratio.


As an example, take a music file rather than the stream file (music reuses the same marker system, but the sections are organized differently). Let's say that we want to compute the final offset of the first marker's position in HCE0000C.SFX, JMP_Boss3_START, which is 2B23Ch.

0x0002b23c -> hex(int(0x0002b23c / 256) * 256) -> 0x0002b200 [we turn the two smallest hex digits into zero, aligning the address downwards until we reach the previous 'ruler notch'] -> hex(int(0x0002b200 / 4)) -> 0xac80

Then it reads 0x00010000 / 4 = 0x00004000 bytes (the amount of streamed stereo data that it grabs at a time) at the absolute position 0x1000 + 0xac80 = 0xbc80.

The 0x1000 comes from the File start 2 field in the music header. It's the start of the song data section.

For more information, and to see the detective work needed to find a related bug in jmarti856's decoder, see this Discord thread in the official Sphinx server.
8h 4 Music marker type.
Music marker types and their numeric constants
10 Start A music/stream must at least have one start point.
9 End Generally, their name seems to always be -1. They stop playback when reaching this Position.
7 Goto Works just as a Loop marker. From what I see this tag can't work in any other way. Inoperative.
6 Loop Seems like every Loop marker must be accompanied by its own trailing Start marker, both sharing Position and Marker count fields. Check out the examples, they can be used as a quick template.
5 Pause Works just like the End marker, but additionally instructs the playback logic to pause now. Seems stubbed, or unused.
0 Jump When reaching this point we jump to a certain Goto marker. Seemingly not used on streams.
Ch 4 Flags. Generally 2 (i.e. (1 << 1)). From what I see it seems unused.
10h 4 Extra. Nothing comes to mind right now, set to zero for now. ¯\_(ツ)_/¯
14h 4 Loop start. When the marker is of type Loop it seems to mark the initial/start/beginning part of the looping zone where we move the playback head back when reaching this loop zone "end" marker.
18h 4 Marker count. Starts at zero and auto-increments. Except when two markers come together as part of a block. e.g. Loop + Start.
1Ch 4 Loop marker count. Set to 1 when the marker is of type Loop, may be the index of the matching Start point marker. As they tend to share the same Loop start -> Pos offset.
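The Position arithmetic from the table above (align down to 256, divide by 4, add the section start) can be written as a tiny helper. It is a sketch: `marker_byte_offset` is a made-up name, and the default `section_start` of 0x1000 is the File start 2 value quoted for the original files.

```python
BLOCK_ALIGNMENT = 256   # positions are rounded down to this boundary
COMPRESSION_FACTOR = 4  # 16-bit PCM sample versus 4-bit ADPCM nibble


def marker_byte_offset(position: int, section_start: int = 0x1000) -> int:
    """Turn a marker Position into an absolute byte offset in the file."""
    aligned = (position // BLOCK_ALIGNMENT) * BLOCK_ALIGNMENT
    return section_start + aligned // COMPRESSION_FACTOR


# JMP_Boss3_START in HCE0000C.SFX has Position 2B23Ch.
print(hex(marker_byte_offset(0x2B23C)))  # 0xbc80
```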

Here's an example of the marker layout for your normal streamed sound with an end point:

  • Start marker data (1 element):
    • Marker pos: 0, State: 00000000h / 00000000h
      • Name: 0, Pos: 0h, Type: Start, Marker count: 0
  • Markers (2 elements):
    • Name: 0, Pos: 0h, Type: Start, Marker count: 0
    • Name: -1, Pos: 5D000h, Type: End, Marker count: 1


Here is an example of the marker layout for a streamed sound with a looped part. Notice that the marker positions are 0/1/3, and that they match the ordering of the same Start markers in the four-element table, skipping the Loop one (index 2):

  • Start marker data (3 elements):
    • Marker pos: 0, State: 00000000h / 00000000h
      • Name: 0, Pos: 0h, Type: Start, Marker count: 0
    • Marker pos: 1, State: 00000000h / 00000000h
      • Name: 1, Pos: 0h, Type: Start, Marker count: 1
    • Marker pos: 3, State: 34B0F8AFh / 34B0F8AFh
      • Name: 2, Pos: 40028h, Type: Start, Marker count: 2
  • Markers (4 elements):
    • Name: 0, Pos: 0h, Type: Start, Marker count: 0
    • Name: 1, Pos: 0h, Type: Start, Marker count: 1
    • Name: 1, Pos: 40028h, Type: Loop, Marker count: 2, Loop marker count: 1, Loop start: 0h
    • Name: 2, Pos: 40028h, Type: Start, Marker count: 2


Another similar example, but that instead of looping back to the beginning, loops to a middle point:

  • Start marker data (3 elements):
    • Marker pos: 0, State: 00000000h / 00000000h
      • Name: 0, Pos: 0h, Type: Start, Marker count: 0
    • Marker pos: 1, State: 0DBBFE07h / 0DBBFE07h
      • Name: 1, Pos: 46000h, Type: Start, Marker count: 1
    • Marker pos: 3, State: 00000000h / 00000000h
      • Name: 2, Pos: 304000h, Type: Start, Marker count: 2
  • Markers (4 elements):
    • Name: 0, Pos: 0h, Type: Start, Marker count: 0
    • Name: 1, Pos: 46000h, Type: Start, Marker count: 1
    • Name: 1, Pos: 304000h, Type: Loop, Marker count: 2, Loop marker count: 1, Loop start: 46000h
    • Name: 2, Pos: 304000h, Type: Start, Marker count: 2


Another looping example:

  • Start marker data (3 elements):
    • Marker pos: 0, State: 00000000h / 00000000h
      • Name: 0, Pos: 0h, Type: Start, Marker count: 0
    • Marker pos: 1, State: 00000000h / 00000000h
      • Name: 1, Pos: 14h, Type: Start, Marker count: 1
    • Marker pos: 3, State: 2F280040h / 2F280040h
      • Name: 2, Pos: 61FF8h, Type: Start, Marker count: 2
  • Markers (4 elements):
    • Name: 0, Pos: 0h, Type: Start, Marker count: 0
    • Name: 1, Pos: 14h, Type: Start, Marker count: 1
    • Name: 1, Pos: 61FF8h, Type: Loop, Marker count: 2, Loop marker count: 1, Loop start: 14h
    • Name: 2, Pos: 61FF8h, Type: Start, Marker count: 2
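The State values shown in these examples can be taken apart with the 0xOOIIPPPP layout described earlier. The sketch below is only an illustration of that bit packing; `AdpcmState`, `unpack_state` and `pack_state` are hypothetical names, and the input is assumed to be the 32-bit value exactly as printed in the examples.

```python
from typing import NamedTuple


class AdpcmState(NamedTuple):
    valpred: int      # signed 16-bit predicted value
    inputbuffer: int  # last byte fetched from the stream
    bufferstep: int   # 1 when decoding stops in the middle of a byte
    index: int        # 7-bit step index


def unpack_state(raw: int) -> AdpcmState:
    valpred = raw & 0xFFFF
    if valpred >= 0x8000:  # sign-extend the 16-bit predictor
        valpred -= 0x10000
    last = (raw >> 24) & 0xFF
    return AdpcmState(valpred, (raw >> 16) & 0xFF, (last >> 7) & 1, last & 0x7F)


def pack_state(state: AdpcmState) -> int:
    last = ((state.bufferstep & 0x1) << 7) | (state.index & 0x7F)
    return (state.valpred & 0xFFFF) | ((state.inputbuffer & 0xFF) << 16) | (last << 24)


print(unpack_state(0x34B0F8AF))
# AdpcmState(valpred=-1873, inputbuffer=176, bufferstep=0, index=52)
```

In all three non-zero states from the examples the Buffer step bit comes out as zero, which fits the remark that playback rarely starts in the middle of a byte.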

Music file structure

Music files are very similar in spirit to the stream file (in fact, the streamed playback system that handles both is the exact same). But each music track is stored separately, contains a single stream, and it is loaded on the spot.

Music header
Offset Size Description
0h 4 Contains a four-byte string with the magic value MUSX. 4D 55 53 58 in hex.
4h 4 Hashcode for the current music bank without the section prefix, should mirror the hexadecimal filename. Always starts with 0xE_____, the music hashcode subtype.
8h 4 Constant offset to the next section, probably unused. Always 0xC9 for all soundbanks.
Ch 4 Size of the whole file, in bytes. Unused.
10h 4 File start 1; an offset that points to the stream look-up file details. Set to 0x800 in the original software.
14h 4 File length 1; size of the first section, in bytes.
18h 4 File start 2; offset to the second section with the sample data. Set to 0x1000 in the original software.
1Ch 4 File length 2; size of the second section, in bytes.
20h 4 File start 3; unused offset. Set to zero.
24h 4 File length 3; unused. Set to zero.
Music/stream header data/file details, relative to the file start 1 block (i.e. 0x800)
0h 20 Music marker header data (14h bytes).
Music marker header data
See the stream marker header data structure, same thing.
20h 52 A linear array containing the music marker start data (34h bytes per entry). Each entry is laid out like this:
Music marker start data
See the stream marker start data structure, same thing.
~ 32 A linear array of stream marker data elements. Notice how each start data structure above contains/embeds their own copy of the stream marker that will come afterwards.
Music marker data
See the stream marker data structure, same thing.


Unlike streamed sounds, music tracks generally have multiple Start point markers, and a Goto marker for looping, almost at the end. See MFX_Defines.h for context and their actual tags.

For example: MFX_Swim_Danger is 0x1BE00030, which is also equivalent to JMP_Swim_Danger_Start. Both can be used to kick-start a music track from the beginning.


Then come JMP_Swim_Danger_LOOP (0x1BE00130), JMP_Swim_Danger_Random_Start1 (0x1BE00230), JMP_Swim_Danger_Random_Start2 (0x1BE00330) and JMP_GOTO_Swim_Danger_LOOP (0x1BE00430). The last one is a marker of type Goto that jumps the playback head back to the JMP_Swim_Danger_LOOP (Start) point. The only special thing is that the Goto marker links back to it, so it's meant for a specific function. It may be redundant due to how the original EuroSound export tool worked, which probably didn't detect duplicates, but we can call it on our own from the game scripting all the same.

As you can see, the lowest two hex digits encode the music track, and the third one optionally encodes the start point:

0x1BE00330
0x1BE..... <- hashcode of type MFX
0x......30 <- track number
0x.....3.. <- jump/start point marker index, zero for the first one/play from the start
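The digit breakdown above translates into a couple of bit masks. This is a sketch with made-up helper names (`split_mfx_hashcode`, `jump_hashcode`); it assumes the start point always fits in a single hex digit, which holds for the Swim_Danger values listed here but is not confirmed for every track.

```python
from typing import Tuple


def split_mfx_hashcode(hashcode: int) -> Tuple[int, int]:
    """Return (track number, jump/start point index) for an MFX hashcode."""
    track = hashcode & 0xFF                # lowest two hex digits
    marker_index = (hashcode >> 8) & 0xF   # third hex digit
    return track, marker_index


def jump_hashcode(track_hashcode: int, marker_index: int) -> int:
    """Build the JMP_* hashcode for a given start point of a track."""
    return (track_hashcode & ~0xF00) | ((marker_index & 0xF) << 8)


track, marker = split_mfx_hashcode(0x1BE00330)
print(hex(track), marker)                 # 0x30 3
print(hex(jump_hashcode(0x1BE00030, 4)))  # 0x1be00430
```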
Music audio data, relative to the file start 2 block (i.e. 0x1000)
0h ~ Start offset of the music audio data. See the stream file section for details, same thing.

SBInfo file structure (HCFFFFFF.SFX)

The structure of this file is very similar to the music files and stream files. This file seems to be unused by the game code. It basically includes two arrays, one with all soundbank hashcodes stored in the first section, and another one with the musicbank hashcodes stored in the second one.

SBInfo header
Offset Size Description
0h 4 Contains a four-byte string with the magic value MUSX. 4D 55 53 58 in hexadecimal.
4h 4 Hashcode for the current file. Always 0x00FFFFFF.
8h 4 File version.
Ch 4 Size of the whole file, in bytes. Unused.
10h 4 File start 1; an offset that points to the soundbanks hashcodes array. Set to 0x800 in the original software.
14h 4 File length 1; size of the first section, in bytes, it has a maximum size of 600 bytes.
18h 4 File start 2; offset to the second section with the musicbanks hashcodes array. Set to 0x1000 in the original software.
1Ch 4 File length 2; size of the second section, in bytes, it has a maximum size of 600 bytes.
20h 4 File start 3; unused offset. Set to zero.
24h 4 File length 3; unused. Set to zero.
SoundBank hashcodes array, relative to the file start 1 block (i.e. 0x800)
0h 600 An array containing all soundbank hashcodes that the current game has.
MusicBank hashcodes array, relative to the file start 2 block (i.e. 0x1000)
0h 600 An array containing all musicbank hashcodes that the current game has.

Formats and frequencies

Streamed (i.e. music) sounds use a hardcoded playback setup depending on the underlying platform and DSP:

Platform | Initial music volume | Music playback frequency | Maximum active voices | Streams playback frequency | Compression, sample encoding | Software decode
PS2 | 75 | 32000 Hz | 20 | 22050 Hz | Headerless Sony VAG (yes, always ADPCM) |
PC | 60 | 32000 Hz | 20 | 22050 Hz | IMA ADPCM |
Xbox | 75 | 44100 Hz | 40 | 22050 Hz | Xbox ADPCM |
GameCube | 55 | 32000 Hz | 20 | 22050 Hz | IMA ADPCM | Early on, uses DSP on final versions

SFX_Data.bin file structure[]

Because the structure data (and their field names) are already publicly available in various text formats as part of the Authoring Tools DLC, I can freely publish a similar 010 Editor binary template with my own comments and notes below, for a more complete understanding:

//------------------------------------------------
//--- 010 Editor v9.0.1 Binary Template
//
//      File: Sphinx SFX_Data.bin files (support array that is stored under _bin_PC/Music for some reason)
//   Authors: Swyter
//   Version: 2018.11.26
//   Purpose: It stores some duplicated data in an optimized/permanent way to avoid the costs of depending on big, loadable soundbanks that might not be available yet.
//            Needed for sound effects to work, as some values are actually retrieved from here instead of their corresponding bank. Probably generated by EuroSound.
//            Why here? In theory the sound folder is per-language and can vary (even if in practice only _bin_PC/_Eng exists), so they decided to put this file there.
//  Category: Audio
// File Mask: sfx_data.bin, SFX_Data.bin
//  ID Bytes: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 // swy: first sfxOutputDetails always empty
//   History: Cleaned up and commented on 2019.05.22
//------------------------------------------------

typedef int32  s32;
typedef uint32 u32;

struct sfxOutputDetails
{
    u32   hashcode <format=hex>; /* swy: unique constant value that should have a matching SFX_* tag in SFX_Defines.h */
    float innerRadius;           /* swy: where it reaches full volume; in meters */
    float outerRadius;           /* swy: so far away that it has completely faded out; in meters */
    float alertness;             /* swy: unused in the final version, I think; this was used for the scrapped mummy/guards sneak sections and Metal Gear Solid-like AI */
    float duration;              /* swy: cumulative length of all the samples in seconds, I think? */
    byte  looping;
    byte  tracking3D;            /* swy: special 3D trigger-like sound emitters (how are they different from trackingless '3D'?) */
    byte  sampleStreamed;        /* swy: streamed 'ambient' sound effects made out of samples that are stored in _bin_PC/_Eng/HC00FFFF.SFX; special sound-bank that is always loaded */
    byte  pad;
} thing[FileSize() / sizeof(sfxOutputDetails)];
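
For readers without 010 Editor, the same structure can be read with a short Python sketch. It assumes little-endian data (the PC build) and treats the three flag bytes as booleans. The class and function names are chosen for this example and are not taken from the game or its tools.

```python
import struct
from typing import NamedTuple


class SfxOutputDetails(NamedTuple):
    hashcode: int
    inner_radius: float     # meters
    outer_radius: float     # meters
    alertness: float
    duration: float         # seconds
    looping: bool
    tracking_3d: bool
    sample_streamed: bool


# Assumption: little-endian layout. One u32, four floats, three flag bytes
# and one pad byte ('x' skips it): 4 + 16 + 3 + 1 = 24 bytes per entry.
ENTRY = struct.Struct("<I4f3Bx")


def parse_sfx_data(data: bytes) -> list:
    """Split the whole buffer into fixed-size sfxOutputDetails entries."""
    entries = []
    # Integer division mirrors FileSize() / sizeof(sfxOutputDetails):
    # trailing bytes that do not fill a whole entry are ignored.
    for index in range(len(data) // ENTRY.size):
        (hashcode, inner, outer, alertness, duration,
         looping, tracking, streamed) = ENTRY.unpack_from(data, index * ENTRY.size)
        entries.append(SfxOutputDetails(
            hashcode, inner, outer, alertness, duration,
            bool(looping), bool(tracking), bool(streamed)))
    return entries
```

Calling `parse_sfx_data(open("SFX_Data.bin", "rb").read())` should return one tuple per entry, with the first one being all zeroes, as the ID Bytes comment in the template notes.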

SFX versions in Eurocom games[]

The EngineX audio subsystem had at least five known public revisions of the SFX/MusX file formats. All of them were very similar; the main difference is that newer versions added more fields, especially in the SFX parameters section, and removed others that were no longer used by newer versions of the audio engine, internally called EuroAudio.

Game Release year Platforms SFX version Notes
Buffy the Vampire Slayer: Chaos Bleeds 2003 GameCube, PS2, Xbox, PC 201 This seems to be the first public version, as officially documented above.
Sphinx and the Cursed Mummy
Athens 2004 2004 PS2 1 Uses the same file structure as version 201.
Spyro: A Hero's Tail 2004 GameCube, PS2, Xbox 4 Header differences:

After the file size field, adds:

  1. Contains a string with the platform (PS2_, PC__, GC__, XB__) tag. FourCC-style.
  2. Timestamp representing the date and time of export. Stored as the number of seconds that have passed since 2000-01-01, 1:00:00 UTC (add 946684800 seconds to obtain a normal Unix timestamp, whose epoch starts in 1970). This was found experimentally by comparing dozens of sample files against other known date ranges. We're reasonably confident.
  3. Unknown, but it is set to 1 when the platform is PC__ or GC__.
  4. Padding for alignment (4 bytes)

Soundbanks: SFX parameters differences

  1. Removes two fields
    1. Inner radius
    2. Outer radius
  2. Adds two new fields
    1. Group HashCode
    2. Group max. channels

Soundbanks: Sample info differences

Some unnecessary fields are removed here:

  1. Number of channels
  2. Bits per sample

StreamFile: StartMarker differences

Some fields are removed here:

  1. Flags
  2. Extra
  3. Marker count

StreamFile: Marker differences

Some fields are removed here:

  1. Flags
  2. Extra
  3. Marker count
Robots 2005 GameCube, PS2, Xbox 5 Some fields changed their type. For example, the SFX flags here are stored as an array of 16 bytes.

Soundbanks: SFX parameters differences

  1. Adds two new fields:
    1. Doppler value
    2. User value
Predator: Concrete Jungle PS2, Xbox
Batman Begins GameCube, PS2, Xbox 6 Soundbanks: SFX parameter differences
  1. Adds two new fields:
    1. SFX ducker
    2. Spare: This field was added to be expanded in future versions, seems to be unused in this version.
Ice Age 2: The Meltdown 2006 GameCube, PS2, Xbox, PC, Wii
Pirates of the Caribbean: At World's End 2007 Xbox 360, PlayStation 3, Wii, PlayStation 2, PSP, PC 10 From this version onward the file format was completely rewritten.
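
The export timestamp added to the header in version 4 (see the Spyro: A Hero's Tail row in the table above) can be turned into a readable date with a small Python sketch like the one below. It only applies the 946684800-second offset quoted in that row. The function and constant names are invented for this example, and how the raw field is read from the file is left out.

```python
from datetime import datetime, timezone

# Offset quoted in the table: seconds to add to the stored value
# to obtain a regular Unix timestamp (epoch 1970-01-01).
EXPORT_EPOCH_OFFSET = 946684800


def export_time_to_datetime(raw_seconds: int) -> datetime:
    """Convert the raw header timestamp into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(raw_seconds + EXPORT_EPOCH_OFFSET, tz=timezone.utc)
```

Note that with this constant a raw value of zero maps to 2000-01-01 00:00:00 UTC, so decoded export times may be an hour off if the reference point really is 1:00:00 UTC.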