Thursday, September 3, 2015

Google Sheets Query involving two groups of data

Question:

The easiest way to explain what I'm trying to do is to give an example of the data, so here's a simple example:

A                B          C            D
Name1        4          Name4     3
Name2        3          Name4     2
Name3        5          Name1     1.4
Name4        3          Name3     6
Name5        1          Name6     8
Name2        3          Name7     10
Name1        4          Name5     2
With that above data, I want to produce a pair of columns, the first that has a list of all the unique entries in A and C, the second having the average of the corresponding values in either B or D for the entry in A or C. The result should look something like:
Name1      3.133333
Name2      3
etc.
Columns A and C are guaranteed to be strings, B and D are guaranteed to be positive numbers. It is also guaranteed that A and C will never contain the same value.
Then I also want to do another query similar, except instead of taking the average of all the corresponding values, I want to count the number of times some specific value occurs with respect to the total. As an example:
A                B          C            D           E
Name1        4          Name4     3           Y
Name2        3          Name4     2           Y
Name3        5          Name1     1.4        N
Name4        3          Name3     6           Y
Name5        1          Name6     8           Y
Name2        3          Name7     10         N
Name1        4          Name5     2           Y
In this case, I want to count the percentage of times Y occurs in E over the total number of times each item in A and C occurs. This time, B and D are no longer important. For example, Name1 occurs 3 times, and has an E value of Y twice, so I want 0.66667. So the result should look like:
Name1    0.66667
Name2    0.5
Name3    0.5
Name4    1

If at any point any of this isn't clear, please ask so I can clarify. Thanks.

Solution:

Screenshot of Spreadsheet:



I have the following formula in cell G1:
=query({A:B;C:D},"Select Col1,avg(Col2) where Col1<>'' group by Col1 label Col1 'Name', avg(Col2) 'Average' format avg(Col2) '0.00'")

and the following formula in cell J1:
=query(arrayformula(iferror(query({{A:A,E:E,E:E};{C:C,E:E,E:E}},"select Col1,count(Col3) where Col1<>'' group by Col1 pivot Col2")*1,query({{A:A,E:E,E:E};{C:C,E:E,E:E}},"select Col1,count(Col3) where Col1<>'' group by Col1 pivot Col2"))),"select Col1,(Col3/(Col2+Col3)) label Col1 'Name', (Col3/(Col2+Col3)) 'Percentage' format (Col3/(Col2+Col3)) '0.00'")

No comments:

Post a Comment