|  | Date: Sun, 19 Nov 2000 16:23:57 -0600 (CST) | 
|  | From: Chris Lattner <sabre@nondot.org> | 
|  | To: Vikram Adve <vadve@cs.uiuc.edu> | 
|  | Subject: Re: a few thoughts | 
|  |  | 
|  | Okay... here are a few of my thoughts on this (it's good to know that we | 
|  | think so alike!): | 
|  |  | 
|  | > 1. We need to be clear on our goals for the VM.  Do we want to emphasize | 
|  | >    portability and safety like the Java VM?  Or shall we focus on the | 
|  | >    architecture interface first (i.e., consider the code generation and | 
|  | >    processor issues), since the architecture interface question is also | 
|  | >    important for portable Java-type VMs? | 
|  |  | 
|  | I forsee the architecture looking kinda like this: (which is completely | 
|  | subject to change) | 
|  |  | 
|  | 1. The VM code is NOT guaranteed safe in a java sense.  Doing so makes it | 
|  | basically impossible to support C like languages.  Besides that, | 
|  | certifying a register based language as safe at run time would be a | 
|  | pretty expensive operation to have to do.  Additionally, we would like | 
|  | to be able to statically eliminate many bounds checks in Java | 
|  | programs... for example. | 
|  |  | 
|  | 2. Instead, we can do the following (eventually): | 
|  | * Java bytecode is used as our "safe" representation (to avoid | 
|  | reinventing something that we don't add much value to).  When the | 
|  | user chooses to execute Java bytecodes directly (ie, not | 
|  | precompiled) the runtime compiler can do some very simple | 
|  | transformations (JIT style) to convert it into valid input for our | 
|  | VM.  Performance is not wonderful, but it works right. | 
|  | * The file is scheduled to be compiled (rigorously) at a later | 
|  | time.  This could be done by some background process or by a second | 
|  | processor in the system during idle time or something... | 
|  | * To keep things "safe" ie to enforce a sandbox on Java/foreign code, | 
|  | we could sign the generated VM code with a host specific private | 
|  | key.  Then before the code is executed/loaded, we can check to see if | 
|  | the trusted compiler generated the code.  This would be much quicker | 
|  | than having to validate consistency (especially if bounds checks have | 
|  | been removed, for example) | 
|  |  | 
|  | >    This is important because the audiences for these two goals are very | 
|  | >    different.  Architects and many compiler people care much more about | 
|  | >    the second question.  The Java compiler and OS community care much more | 
|  | >    about the first one. | 
|  |  | 
|  | 3. By focusing on a more low level virtual machine, we have much more room | 
|  | for value add.  The nice safe "sandbox" VM can be provided as a layer | 
|  | on top of it.  It also lets us focus on the more interesting compilers | 
|  | related projects. | 
|  |  | 
|  | > 2. Design issues to consider (an initial list that we should continue | 
|  | >    to modify).  Note that I'm not trying to suggest actual solutions here, | 
|  | >    but just various directions we can pursue: | 
|  |  | 
|  | Understood.  :) | 
|  |  | 
|  | >    a. A single-assignment VM, which we've both already been thinking | 
|  | >       about. | 
|  |  | 
|  | Yup, I think that this makes a lot of sense.  I am still intrigued, | 
|  | however, by the prospect of a minimally allocated VM representation... I | 
|  | think that it could have definite advantages for certain applications | 
|  | (think very small machines, like PDAs).  I don't, however, think that our | 
|  | initial implementations should focus on this.  :) | 
|  |  | 
|  | Here are some other auxiliary goals that I think we should consider: | 
|  |  | 
|  | 1. Primary goal: Support a high performance dynamic compilation | 
|  | system.  This means that we have an "ideal" division of labor between | 
|  | the runtime and static compilers.  Of course, the other goals of the | 
|  | system somewhat reduce the importance of this point (f.e. portability | 
|  | reduces performance, but hopefully not much) | 
|  | 2. Portability to different processors.  Since we are most familiar with | 
|  | x86 and solaris, I think that these two are excellent candidates when | 
|  | we get that far... | 
|  | 3. Support for all languages & styles of programming (general purpose | 
|  | VM).  This is the point that disallows java style bytecodes, where all | 
|  | array refs are checked for bounds, etc... | 
|  | 4. Support linking between different language families.  For example, call | 
|  | C functions directly from Java without using the nasty/slow/gross JNI | 
|  | layer.  This involves several subpoints: | 
|  | A. Support for languages that require garbage collectors and integration | 
|  | with languages that don't.  As a base point, we could insist on | 
|  | always using a conservative GC, but implement free as a noop, f.e. | 
|  |  | 
|  | >    b. A strongly-typed VM.  One question is do we need the types to be | 
|  | >       explicitly declared or should they be inferred by the dynamic | 
|  | >       compiler? | 
|  |  | 
|  | B. This is kind of similar to another idea that I have: make OOP | 
|  | constructs (virtual function tables, class heirarchies, etc) explicit | 
|  | in the VM representation.  I believe that the number of additional | 
|  | constructs would be fairly low, but would give us lots of important | 
|  | information... something else that would/could be important is to | 
|  | have exceptions as first class types so that they would be handled in | 
|  | a uniform way for the entire VM... so that C functions can call Java | 
|  | functions for example... | 
|  |  | 
|  | >    c. How do we get more high-level information into the VM while keeping | 
|  | >       to a low-level VM design? | 
|  | >       o  Explicit array references as operands?  An alternative is | 
|  | >          to have just an array type, and let the index computations be | 
|  | >          separate 3-operand instructions. | 
|  |  | 
|  | C. In the model I was thinking of (subject to change of course), we | 
|  | would just have an array type (distinct from the pointer | 
|  | types).  This would allow us to have arbitrarily complex index | 
|  | expressions, while still distinguishing "load" from "Array load", | 
|  | for example.  Perhaps also, switch jump tables would be first class | 
|  | types as well?  This would allow better reasoning about the program. | 
|  |  | 
|  | 5. Support dynamic loading of code from various sources.  Already | 
|  | mentioned above was the example of loading java bytecodes, but we want | 
|  | to support dynamic loading of VM code as well.  This makes the job of | 
|  | the runtime compiler much more interesting:  it can do interprocedural | 
|  | optimizations that the static compiler can't do, because it doesn't | 
|  | have all of the required information (for example, inlining from | 
|  | shared libraries, etc...) | 
|  |  | 
|  | 6. Define a set of generally useful annotations to add to the VM | 
|  | representation.  For example, a function can be analysed to see if it | 
|  | has any sideeffects when run... also, the MOD/REF sets could be | 
|  | calculated, etc... we would have to determine what is reasonable.  This | 
|  | would generally be used to make IP optimizations cheaper for the | 
|  | runtime compiler... | 
|  |  | 
|  | >       o  Explicit instructions to handle aliasing, e.g.s: | 
|  | >            -- an instruction to say "I speculate that these two values are not | 
|  | >               aliased, but check at runtime", like speculative execution in | 
|  | >             EPIC? | 
|  | >          -- or an instruction to check whether two values are aliased and | 
|  | >             execute different code depending on the answer, somewhat like | 
|  | >             predicated code in EPIC | 
|  |  | 
|  | These are also very good points... if this can be determined at compile | 
|  | time.  I think that an epic style of representation (not the instruction | 
|  | packing, just the information presented) could be a very interesting model | 
|  | to use... more later... | 
|  |  | 
|  | >         o  (This one is a difficult but powerful idea.) | 
|  | >          A "thread-id" field on every instruction that allows the static | 
|  | >          compiler to generate a set of parallel threads, and then have | 
|  | >          the runtime compiler and hardware do what they please with it. | 
|  | >          This has very powerful uses, but thread-id on every instruction | 
|  | >          is expensive in terms of instruction size and code size. | 
|  | >          We would need to compactly encode it somehow. | 
|  |  | 
|  | Yes yes yes!  :)  I think it would be *VERY* useful to include this kind | 
|  | of information (which EPIC architectures *implicitly* encode.  The trend | 
|  | that we are seeing supports this greatly: | 
|  |  | 
|  | 1. Commodity processors are getting massive SIMD support: | 
|  | * Intel/Amd MMX/MMX2 | 
|  | * AMD's 3Dnow! | 
|  | * Intel's SSE/SSE2 | 
|  | * Sun's VIS | 
|  | 2. SMP is becoming much more common, especially in the server space. | 
|  | 3. Multiple processors on a die are right around the corner. | 
|  |  | 
|  | If nothing else, not designing this in would severely limit our future | 
|  | expansion of the project... | 
|  |  | 
|  | >          Also, this will require some reading on at least two other | 
|  | >          projects: | 
|  | >               -- Multiscalar architecture from Wisconsin | 
|  | >               -- Simultaneous multithreading architecture from Washington | 
|  | > | 
|  | >       o  Or forget all this and stick to a traditional instruction set? | 
|  |  | 
|  | Heh... :)  Well, from a pure research point of view, it is almost more | 
|  | attactive to go with the most extreme/different ISA possible.  On one axis | 
|  | you get safety and conservatism, and on the other you get degree of | 
|  | influence that the results have.  Of course the problem with pure research | 
|  | is that often times there is no concrete product of the research... :) | 
|  |  | 
|  | > BTW, on an unrelated note, after the meeting yesterday, I did remember | 
|  | > that you had suggested doing instruction scheduling on SSA form instead | 
|  | > of a dependence DAG earlier in the semester.  When we talked about | 
|  | > it yesterday, I didn't remember where the idea had come from but I | 
|  | > remembered later.  Just giving credit where its due... | 
|  |  | 
|  | :) Thanks. | 
|  |  | 
|  | > Perhaps you can save the above as a file under RCS so you and I can | 
|  | > continue to expand on this. | 
|  |  | 
|  | I think it makes sense to do so when we get our ideas more formalized and | 
|  | bounce it back and forth a couple of times... then I'll do a more formal | 
|  | writeup of our goals and ideas.  Obviously our first implementation will | 
|  | not want to do all of the stuff that I pointed out above... be we will | 
|  | want to design the project so that we do not artificially limit ourselves | 
|  | at sometime in the future... | 
|  |  | 
|  | Anyways, let me know what you think about these ideas... and if they sound | 
|  | reasonable... | 
|  |  | 
|  | -Chris | 
|  |  |