------------------------------------------------------------------------------
Comments:
1) The structure of the 4th section of this paper ("Existing co-scheduling
techniques") is not good. I think those techniques should be introduced from
the simplest to the most complicated, which would make them easier for
readers to follow. It may be better if "Gang Scheduling" is introduced first.

2) The topic of this paper is Co-scheduling on Cluster Systems, but the
first four techniques in 4.1-4.4 are not suitable for cluster systems, all
for similar reasons. Why spend so much time (about 2 pages) explaining them
in detail?

3) In section 6.1, the author states that "The two key mechanisms of
Implicit & Dynamic Coscheduling co-exist in both of them and contribute to
the performance of co-scheduling. Now we can acclaim that they should have
comparable performance."
   That does not follow logically. For example, the second key factor they
claim is "the priority boost mechanism". This mechanism has different
effects in Implicit and Dynamic coscheduling. Dynamic coscheduling takes an
aggressive approach and changes the priority of a process directly, while
implicit coscheduling relies on an OS feature to passively gain higher
priority after wakeup. The effects might therefore be very different, even
though the priorities are changed in both cases (please note that they are
changed in different ways).

4) In section 6.2, some charts or curves would make the material much easier
for readers to understand.

5) In section 7, the structure is a little confusing. In 7.1, a way to
improve co-scheduling on clusters is introduced. Then 7.2 provides
introduction and preparation for 7.3 (explaining what stride scheduling is,
although that algorithm cannot be applied to clusters without modification).
I think you should reorganize this section to make clear to the reader what
applies to clusters and what is just introduction.

6) This paper mainly involves two projects: "NOW" at Berkeley and "HPVM" at
UIUC/UCSD. But it's not clear in this paper which project uses which
algorithm or mechanism. And there's no comparison between these two
projects, although the comparison among different mechanisms makes sense.
------------------------------------------------------------------------------
The overall layout was not very easy to follow.  You review the 
different topics and then wander a bit.  Some sort of better structure
or better introductory paragraphs are needed.  
 
On the other hand, the paper is extremely thorough.  It covers a variety
of schemes, discussing pros and cons.  It compares the two best schemes,
and even makes recommendations for improvements.  

The discussion of the local scheduler is delayed until the very end 
of the paper.  Although it is discussed, it seems to be a secondary point 
that is just thrown in at the last minute.  The paper would have been 
better laid out without it.  It needs to either be integrated into the 
rest of the discussion on co-scheduling techniques or dropped from the
paper altogether.  

You occasionally use terms that are not well-known without defining them,
for instance, processor thrashing.  It is especially bad when entire
sentences are reused in the paper while a term they contain is never defined.
------------------------------------------------------------------------------
1.  A large portion of the paper is spent on summarizing the actual
coscheduling algorithms.  It would have been helpful to have descriptions
of the projects for which they were designed (assuming they were designed
for a system and not as a standalone algorithm) to see the context in
which they are supposed to run.  It is possible that the context for which
these algorithms were designed would explain some of the design criteria.
2.  The motivation for improving the coscheduling algorithm is clear, but
the motivation for using the Stride scheduler as a model is not.  Please
explain.
3.  Also, some significant details of the Stride scheduler are unclear.
For example, how would one guarantee a fair initial allocation of tickets?
Or is this expected to level out over time?
4.  Please rehash your suggested improvements for the coscheduler in the
Conclusion.
5.  Please use a spell checker (ispell).
------------------------------------------------------------------------------
Parallel scheduling is not 'unsolved'. There are several methods that
produce realizable schedules which, while perhaps not always optimal, do
provide benefit in terms of either improved processor usage or machine
throughput.

Please define 'processor thrashing'.

Very good point that workstation clusters allow easy parallel programming
without rewriting the entire operating system.

The Solaris scheduler is mentioned several times. Is this a dynamic
coscheduler, a gang scheduler, or other? And how does it interact with the
scheduling techniques presented? Is it assumed that the Solaris scheduler
operates concurrently with these other presented schedulers?

I liked the discussion on Stride Scheduling. The intuitive terms like
"loan" and "borrowing", combined with clear text, make this section a 
pleasure to read.

I think your ideal coscheduling algorithm need not be fair. Your system
may be concerned only with efficiency at the direct expense of fairness.
A batch processing system is an example of this, where the interactive
response is not important, compared with the machine throughput.  Also,
the ideal algorithm may want to account for efficiencies other than
processor usage, like the cost to compute these schedules (not trivial
when we're talking about things like tree- and graph-traversals), or the
cost to transfer data to and from the scheduled activities.
------------------------------------------------------------------------------
Comments: 

It is easy to see that a lot of effort has been put in to gather all the
information that you have tried to explain.  However, the organization of the
paper was difficult to follow, which made it difficult to understand.  This
paper would have been better had you chosen a narrower focus and addressed
it with more clarity and depth.

1.      The abstract states that co-scheduling techniques discussed would be
more suited to NOW and HPVM but the paper does not elaborate on this.  The
abstract should be concise and a good overview of the paper.
2.      In section 2 you claim that “[os] in network stations can be modified
or extra layers can be built upon existing os’s in workstations to provide the
[os] for the whole cluster.”  It would be nice to see an argument justifying
this claim or a reference to where this argument was made.
3.      The structure of sections defined in the introduction was not followed
consistently (e.g., section 5). Also, the logical structure of the subsections
did not fall into the category identified by the section title, except for
sections 3 and 4.
4.      The categories of explicit and dynamic co-scheduling are not intuitive
and should have been better explained.  It was not clear what property
categorized them as such.  (In my mind, implicit/explicit or static/dynamic
are intuitive categories, but not implicit/dynamic). The fact that two of the
scheduling algorithms explained in section 4 were called Implicit
Coscheduling (4.5) and Dynamic Coscheduling (4.6) led to further confusion;
additionally, dynamic coscheduling existed as both an algorithm and as a
category of algorithms.
5.      Section 3.2 identifies efficiency, flexibility, non-intrusiveness,
dynamism, and fairness as properties of coscheduling algorithms.  Other
factors such as overhead, complexity, nature (distributed vs. centralized),
etc. should have been mentioned.
6.      The arguments in the analysis of the algorithms in section 4 do not
address these attributes (e.g., 4.3 discusses only cost and fairness).
7.      Define spin-waiting, spin-blocking, fine-grained jobs/applications,
coarse-grained jobs/applications (unless these are well-known terms; if so,
please forgive my ignorance).
8.      In section 7, you propose 3 coscheduling techniques.  You, however, do
not convincingly support your claims regarding the improvement that these
techniques offer.  An analysis by comparing the attributes of each of these
techniques to the ideal algorithm’s properties (sec 3.2) would give your paper
more closure.  The use of the phrase, “of course, we need many experiments to
claim this, but it seems promising” not only emphasizes the speculative nature
of your argument but also diminishes its overall impact.
9.      Redundancy: a non-thesis sentence is repeated three times.
10. Spell-check the final draft to catch trivial but costly typos!
------------------------------------------------------------------------------

	Overall, I thought the paper was well written.  I chose the
following scores for the reasons outlined below.  

IMPORT:  
	The impression (or thesis) that I got from the paper was that
co-scheduling is not a new issue.  But co-scheduling as it is applied to
clusters is, and it has required a new approach, namely Implicit and
Dynamic Co-scheduling.  And since clusters are a hot item nowadays (at least
that's the impression I get), this gets a score of 6.  The 6 is because
it wasn't clear to me that co-scheduling was hot in relation to cluster
issues nowadays. 

NOVELTY:
	The novelty rank was based on the authors' choice to propose a
new co-scheduler which combined some of the Dynamic and Implicit
Co-scheduling features.  They chose to use the message-arrival
trigger for scheduling from Dynamic Co-scheduling and the spin-time
mechanism for deciding when to context switch from Implicit Co-scheduling.
In addition, they proposed using a Stride scheduler to improve fairness
with respect to fine and coarse-grained jobs.  It would have been nice to
hear an explanation as to 1) why the Implicit Co-scheduling fairness
algorithm did not work and 2) why the authors chose to use the Stride
scheduler (as opposed to other proportional schedulers?).  So, I decided
to give the score a 5 because of 1) and 2).

QUALITY:
	As before, the paper was really well written in that the authors
seemed to have a clear understanding of their papers and were able to
convey this.  However, because of the issues discussed in novelty, I am 
assigning a 6.

OVERALL:
	Overall, this was a good paper.  However, I did not see why 3 1/2
pages needed to be spent on co-scheduling techniques for MPPs or SMPs as
these techniques were never mentioned afterwards and from what I
understood, are not crucial to the thrust of the paper. While I found these
techniques interesting, it seems it would have been sufficient to discuss
these explicit co-scheduling techniques within a couple of paragraphs 
(i.e. part of background information).  I would have rather seen the space 
better spent discussing the issues that I mentioned in NOVELTY.  Also, just 
a petty side note: is it co-scheduling or coscheduling?  It should be 
consistent.  Regardless, I feel this paper earned a score of 6 and should 
be accepted.
------------------------------------------------------------------------------

Import
The work presented by this paper looks very interesting. It
contains research results necessary for the development of
contemporary parallel systems consisting of low cost
workstations. The authors cover all the recent progress in
the area and they categorize the proposed solutions using
correct criteria.

Novelty
The observations made by the authors are novel. Especially,
the improvements proposed by this paper to existing solutions
look really interesting and worth our attention.

Quality
The observations made by the authors for the existing
solutions in their problem are sound. On the other hand,
the solution they propose might not be so well supported,
but some of the topics look really promising.

Overall
This paper should definitely be accepted. It has something to
offer to our efforts in making better operating systems.