------------------------------------------------------------------------------
This paper has a good overview of many messaging schemes. The overall layout of the paper was well done. It would have been better if the internal structures of each section had covered the same topics. For instance, all of the sections talk about architecture, but some bring in additional topics that are specific to only those sections. There are also many grammatical and syntactical mistakes that should have been caught before submission. The actual discussion of the different techniques was rather sparse. Rather than going through the pros and cons (as you said you were doing in the conclusion), you just stated that each of the methods had means that handle different issues. It probably would have been better to discuss where each of these messaging schemes would be most applicable in their respective sections. Some sections were obviously incomplete. For instance, the sentence "A request-reply mechanism is used" is in 3.1, but then there is no follow-up or discussion.
------------------------------------------------------------------------------
I thought the topic was interesting (although it was covered to some extent by the assigned papers). By far, the major area of improvement for this paper should be editorial cleanup. First, it needs to be run through a spell-checker and a grammar checker. Then, someone besides the authors should proofread the draft to make sure that it reads evenly and that sections written by different authors are well connected.
------------------------------------------------------------------------------
This paper presents and compares several approaches for achieving efficient communication in cluster systems. It lays out their design principles and key issues clearly. There are some points that could be improved:

1. Firstly, the title of the paper is not very suitable.
AM, FM and U-NET mainly address how to implement a low-latency, high-bandwidth network protocol. They sit at a much lower level than process communication, although process communication can benefit from these high-performance networks.

2. The authors compare these projects on some key issues and show their different design choices. But they do not clearly explain why those choices were made, or what the benefits and tradeoffs of each choice are. For example, why do AM and FM use asynchronous message receipt (the message handler)? What is its benefit compared with the traditional send/receive model? The message handler is not as natural an idea for programmers as the send/receive model.

3. I think they ignore another key design issue of these projects: what kind of service they provide to higher-level applications. One example is flow control. As far as I know, FM does not support multiple processes on a node mainly because it is difficult to provide simple and efficient flow control for multiple processes. It is a tradeoff between functionality and efficiency. It is also a key problem for the other projects.

4. It would be better to provide an overall comparison of these techniques and say something about performance. After reading this paper, I still don't know which one is the best, or which is most suitable for which kinds of applications.
------------------------------------------------------------------------------
Easy to read. Section 2 claims they will break down the analysis into 5 nice distinct areas; however, the actual analysis of the projects doesn't seem to be structured that way.
------------------------------------------------------------------------------
A good paper in general. The systems under examination are somewhat well described, but with ambiguities at some points.

1. The introduction lets the reader understand what the intent of this paper is.

2. This section sufficiently explains the important issues in process communication.
* It is unclear what the sentence "This will provide "zero-copy" on the sender side to remove the overhead of additional copy, but mechanisms that protect the pinned-down buffer and cooperate with the existing virtual memory subsystem have to be provided." on page 2 means.
* The paragraph "Small vs. large messages handling Some research on communication suggests that exchange of small messages is the dominant part in process communication. Therefore, most research focus on optimizing their communication system to get good performance for small messages." on page 3 is not sufficiently backed up: it should be better explained why small message exchange is dominant.
* The meaning of the word "indices" is not explained on page 4.
* In the paragraph "Architecture: FM differs from a pure message paradigm by not having explicit receiver. Instead, each message includes the name of a handler, which is a user-defined function that is invoked upon message arrival. FM provides buffering allowing senders to make progress while their corresponding receivers are computing and not servicing the network." on page 5, the message passing is not clearly explained: since there is no explicit receiver, how are messages delivered, and to whom?
* In the next paragraph:
  1. The meaning of "uniform handling of messages with respect to size" is not explained.
  2. "FM does not follow a rigid request-reply scheme". Then what scheme is employed?
  3. It is not clear what is meant by "The FM's send calls do not normally process incoming messages (in contrast to Active Messages), enabling a program to control when the received data is processed."
* The paragraph "To reduce these costs, the designers of FM have given full consideration of how to exploit the hardware features.
The basic features are gather/scatter, layer interleaving, and receiver flow control." on page 6 is not very clear: do they mean that gather/scatter, layer interleaving, and receiver flow control are implemented in hardware?
* On page 7 it is not explained how the security and various restrictions are implemented. The meaning of the sentence "All sender's writes to its automatic-update region are automatically transferred to the remote buffer. dresses." on page 7 is not understood.

4. The discussion covers the main points under examination somewhat adequately. However, suggestions for improvements to the projects are not provided.

Some general comments on the paper:

1. There are a lot of spelling and grammar errors, and in some cases they make the paper somewhat difficult to read.

2. Some points are not very well explained, so the reader must read some paragraphs a couple of times in order to get the exact meaning.
------------------------------------------------------------------------------
Your comments on the paper. These are public comments that the authors of the papers will see. Provide feedback to improve their paper, etc.

The paper has a good and complete summary of each project. "Issues in communication" gives good vectors (such as kernel involvement, buffer management, etc.) to guide the discussion. However, I wish the authors would elaborate on the vectors a little more. Also, the presentation (summary) of each project could follow the vectors presented. The organization of the discussion section is confusing and hard to follow. Much effort is needed to sort out the difference between facts and criticism.
------------------------------------------------------------------------------
This seems like a very important topic that is becoming even more important with supercomputer manufacturers starting to discontinue massively parallel machines.
I note that one of the claims you make is that clusters are great in part because they're so expandable. I would've loved to see a discussion comparing clusters to node-based machines such as the Sun Enterprise 10000, which is modular and expandable. I do like your discussion of where each method of communication is best used. I did get a little lost reading the paper, though, and would've liked to be coddled a bit more with definitions of terms.