Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
725 views
in Technique[技术] by (71.8m points)

excel - "Non-contiguous" range specification in formula

[2018-08-01: See new material near bottom of post.]

I just ran across a Q-n-A here that included an example of a function referencing a range using a specification I had not seen before. I played around with it a little and discovered that most Excel functions will return a value when using this kind of reference:

=SUM(A1:A5:C1:C5:E1:E2:F3:F4)

Notice all the ":"s in there.

With the cursor in the formula bar, those non-contiguous cells are highlighted, as if only those cells would be totaled. However, when I experimented, I discovered that it is equivalent to this formula:

=SUM(A1:F5)

However, this formula does what I expect:

=SUM(A1:A5,C1:C5,E1:E2,F3:F4)

Example: I have the range A1:F5 defined as:

--  A   B   C   D   E   F

1   1   2   3   4   5   6

2   2   4   6   8   10  12

3   3   6   9   12  15  18

4   4   8   12  16  20  24

5   5   10  15  20  25  30 

The formula results are:

=SUM(A1:F5) is 315

=SUM(A1:A5:C1:C5:E1:E2:F3:F4) is 315

=SUM(A1:A5,C1:C5,E1:E2,F3:F4) is 117

This illustrates my point that the first two seem to be equivalent, and the third "correct" way of defining a discontiguous range gives a different result (but what I would expect).

I looked around here and elsewhere on the Internet and found no relevant discussion.

So, my questions:

  1. Is that odd range reference of any real use?
  2. is it really a valid sort of range reference?

[New material 2018-08-01]

@YowE3K wondered what would happen with an odd number of cells. Here is my example:

--  A   B   C  

1   1   1   1   

2   1   1   1   

3   1   1   1   

4   1   1   1   

Now, in a faraway cell, I enter this formula:

=SUM(A1:B3:$C$2)

Then if I drag the formula around, it creates interesting results, basically the 3-row, 2-column range extended to a rectangular range that always includes the fixed cell. The smallest sum you can ever get in this example is 3. Try it and see.

This shows a glimmer of something useful, though I haven't come up with a concrete use for it yet.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)
  1. When separated by a comma, simpler is better. Avoid referencing subsets within the same rectangular block that Excel would create if you only selected the upper left and bottom right of the region in question.

  2. No, there are not too many practical exercises that warrant this additional complexity (when referencing cells separated by a colon only). I read a comment/proposed response by someone who suggested there is use, depending on the data structure. I advise that individual seek a better format/ layout of data before using an overly complex referencing regime/strategy

  3. RE: separation by comma: Yes, there are numerous practical and valid exercises that are being managed in suitable fashion

Background / detail

You have asked two questions: i.e. pertaining to 1) validity of 'odd referencing' and 2) potential use (paraphrased).

Re: validity - going back to basics, when you sum the whole are you are simply referencing the upper most left and bottom most right cells - Excel automatically creates a 'perimeter' to encompass any and all cells separated by a colon. This perimeter is always constructed to form a rectangle (the square being a special type of rectangle). You can actually 'force' Excel to keep these cells separated initially, i.e. the first and second summations are identical (parametrically, and in terms of their result):

Figure 1) Usual summation of a rectangular enclosed block

Figure 2) Identical variation depicting upper left and bottom right cells

Probably the most important implication/consequence of this is as follows:

  • You cannot double-sum no matter how many 'sub-ranges', disparate / overlapping or otherwise, PROVIDED these are separated by a colon. For instance, the following gives exactly the same solution as the first 2 figures:

Figure 3) Union of cells within outer perimeter

However, for most practical applications, it would be rare and odd to require multi-subset referencing (especially when it comes to summation). I could envisage something like this when using various subsets that are separated with a comma.

Generally, parsimony/simplicity in formulation is required (for auditability, reviewability, validation, computationally speed, error mitigation, and so on).

I believe it would be quite challenging to concoct / fathom a scenario where referencing several subsets within (or across) tables of data served a cause that couldn't be achieved using the simple, straightforward referencing per Figure 1).

In fact, it would require a highly bespoke exercise/structure to justify something like the summation shown in Figure 3, and as a long-serving manager who reviews excel models almost on a daily basis, I would certainly advise against constructing a summation that referenced in this manner.

The more that I consider, it, multiple referencing of this manner impedes functionality. For instance, using the referencing style in Figures 1-2 allows one to take summation across multiple (adjacent) sheets (which I'm not a big fan of because any sheet that's placed between Sheets 4 and 2 will be included in the summation, which can lead to errors):

=SUM(Sheet4:Sheet2!A1:F5)

Where Sheets 2, 3, and 4 contain identical values to those you provided, which yields 945 (=315 x 3) as one would expect. Doing this for version 3 results in #Value! (not even Excel likes it!).

Now, in terms of separation by commas, this is an entirely different thing but involves similar 'mechanics' under the hood. Excel will only 'perimeter-fence' cells separated by a colon, and it will create several 'ring-fenced' rectangular perimeters, one for each group of (colon-separated) cells that are separated by a comma.

This 'setup' has a number of practical uses: it allows non-contiguous groups of cells to be summed; different conditions can be placed on each e.g. here's a function that sums all even numbers on the left hand side, and all odd numbers on the right:

=SUM(A1:C5*(MOD(A1:C5,2)=0),D1:F5*(MOD(D1:F5,2)<>0))

This yields 99 (if I haven't mixed up the mod divisor/operand:). PS - at this point I realised I copied your values incorrectly and inadvertently included the row labels as a column. So the 99 is based upon the 'correct' values which reconciles to results you've shown.

Of course, I could take the sum of the left & right side (without any such conditions, or with some condition that will also be satisfied, e.g. mod(a,b) < 0), where each half is separated by a comma. Of course this would yield the exact same overall result as summing the entire region like shown in Figures 1-3.

So, when separated by a comma, Excel treats this exactly the same as Figure 1-3 BUT it only does so in respect of each 'subset' of 'colon' separated cells. Here, duplication/overlapping or omitted ranges do count / make a difference.

There is no 'odd' form of referencing when it comes to colon or comma, there is good/best practice, but with the infinite different possibilities/data table constructs, initiatives and uses/models etc/ there will undoubtedly be some use for complicated forms of straightforward summations (when separated by a colon); I see these as being rare/far-between and advise against a complicated set up if it can be avoided.

However, I see greater application / purpose for a similar construct with 2/3 or more subsets separated by a comma. In fact, this effectively achieves the same outcome as the status bar message (with summation statistics selected to 'display', i.e. after right-clicking and selecting it):

Figure 4


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
...