MAJC Discussion

MAJC is a VLIW architecture, but it carries 2 bits of "template" information in place of 2 bits of the first instruction's opcode. This makes it deviate from traditional VLIW and helps decrease code size. It's kind of like the bits in Tera that give the number of following independent instructions.
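To make the template idea concrete, here is a minimal sketch of how a decoder could use a 2-bit header to find packet boundaries. The bit position (the top two bits of the first 32-bit word) and the encoding (header value plus one equals the number of instructions in the packet) are assumptions for illustration, not details taken from the paper. `packet_length` and `split_packets` are hypothetical names.

```python
def packet_length(first_word: int) -> int:
    """Return the number of instructions in a packet (1 to 4).

    Assumption: the 2 template bits are the top two bits of the first
    32-bit instruction word and encode (length - 1).
    """
    return ((first_word >> 30) & 0b11) + 1


def split_packets(words: list[int]) -> list[list[int]]:
    """Group a flat stream of 32-bit words into variable-length packets."""
    packets = []
    i = 0
    while i < len(words):
        n = packet_length(words[i])
        if i + n > len(words):
            raise ValueError(f"truncated packet at word {i}")
        packets.append(words[i:i + n])
        i += n
    return packets


# Hypothetical stream: a 1-instruction packet, then a 3-instruction packet.
stream = [0x0000_0001, 0x8000_0002, 0x0000_0003, 0x0000_0004]
print([len(p) for p in split_packets(stream)])  # [1, 3]
```

Under these assumptions a packet with fewer than four useful instructions simply occupies fewer words, which is where the code-size saving over a fixed-width VLIW word would come from.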

MAJC is also data-type agnostic, meaning that no resources are reserved for a specific data type: all registers can hold all data types, a VLIW packet can consist entirely of instructions of the same type, etc. It uses SIMD instructions when the data type doesn't take up the entire register, although SIMD only really helps for streaming data. The disadvantages of data-type agnosticism are: format (integer and floating-point types are stored very differently), routing across the register file, bandwidth (split register files might require fewer read/write ports) and address space (the big one). Only a few bits are available for register numbers, so having 2 different kinds of register lets you have twice as many registers available with the same number of bits. However, address space may be less of an issue for MAJC since, depending on how the local and global registers are split up, the 4 instructions might have 416 registers between them, each instruction needing only 7 bits to name a register. The question is whether that much flexibility is a good thing.
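The 416 figure can be reproduced with a little arithmetic. The sketch below assumes 4 functional units, a 7-bit register field (128 names per instruction), and a hypothetical split of 32 global and 96 local registers per unit. That split is chosen only because it yields 416; the split used by a real implementation may differ. `total_registers` is a made-up helper name.

```python
def total_registers(units: int, field_bits: int, global_regs: int) -> int:
    """Count distinct registers when each unit sees global + local registers."""
    visible = 2 ** field_bits            # names one instruction can encode
    local_regs = visible - global_regs   # registers private to each unit
    if local_regs < 0:
        raise ValueError("more global registers than the field can name")
    return global_regs + units * local_regs


print(total_registers(4, 7, 32))   # 416 (assumed split: 32 global, 96 local)
print(total_registers(4, 7, 96))   # 224 (alternative split: 96 global, 32 local)
print(total_registers(4, 7, 128))  # 128 (everything global)
```

For comparison, a conventional split into integer and floating-point files with the same 7-bit field would give 2 * 128 = 256 registers, so under the assumed split the local/global scheme can name more registers without widening the field.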

Virtual channels are an important part of MAJC, but the paper doesn't do a good job of explaining them. As far as we could tell, they're used for message passing at the register level. We're not sure what happens on an interrupt or a context switch. Queues could be used to communicate between 2 or more processes.

MAJC uses something called space-time computing (STC) to help resolve memory dependences. All shared variables go through the heap, not the stack, and the speculative threads have different versions of the variables. This concept is kind of like register renaming, except that it applies to variables and is controlled through software. STC works best for Java; C/C++ programs can't really use it because pointer analysis is too hard, so you can't easily distinguish between the stack and the heap.
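The "renaming for variables" idea can be illustrated with a toy model. This is only a sketch of per-thread versions of heap variables, not MAJC's actual mechanism: the class and method names are hypothetical, and because how conflicts are detected is not covered here, committing or discarding a version is an explicit call.

```python
class VersionedHeap:
    """Toy model of software-controlled versioning of shared heap variables."""

    def __init__(self):
        self.committed = {}   # values visible to the non-speculative thread
        self.versions = {}    # speculative thread id -> its private writes

    def write(self, name, value, thread=None):
        """Write to committed state, or to a private version if speculative."""
        if thread is None:
            self.committed[name] = value
        else:
            self.versions.setdefault(thread, {})[name] = value

    def read(self, name, thread=None):
        """A speculative thread sees its own version first, then committed state."""
        if thread is not None and name in self.versions.get(thread, {}):
            return self.versions[thread][name]
        return self.committed[name]

    def commit(self, thread):
        """Speculation succeeded: merge the thread's version into committed state."""
        self.committed.update(self.versions.pop(thread, {}))

    def discard(self, thread):
        """Speculation failed: throw the thread's version away."""
        self.versions.pop(thread, None)


heap = VersionedHeap()
heap.write("x", 1)                      # non-speculative write
heap.write("x", 2, thread="spec1")      # speculative write, kept private
print(heap.read("x"), heap.read("x", thread="spec1"))  # 1 2
heap.discard("spec1")                   # mis-speculation: version dropped
print(heap.read("x", thread="spec1"))   # 1
heap.write("x", 3, thread="spec2")
heap.commit("spec2")                    # successful speculation: merged
print(heap.read("x"))                   # 3
```

Just as a renamed register lets a later instruction write without clobbering the value an earlier one still needs, the private version lets a speculative thread write `x` without disturbing what the non-speculative thread sees.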

The given virtual channel example is non-speculative.

The paper doesn't talk about branch prediction or shared-memory operations. Since it's VLIW, the designers probably just use static prediction plus predication, although they could do dynamic prediction.

MAJC uses a form of release consistency (RC). This lets the designers punt on some issues, although speculation forces them back to program order, which is even stricter than sequential consistency. MAJC's RC is even more relaxed than the classic kind: because it can distinguish between private and shared variables, loads and stores to private data don't have to wait.