**Load the data**

Like in the previous post, we’ll use only vanilla JavaScript, so you will be able to follow along in the console of your browser.

To load the data into your console, do the following:

```js
// this is the url that you need to fetch the data
let url = "https://www.sebastianlammers.com/api/data.json?name=penguins";
// this will load the data into your session under the name "data"
let data = await (await fetch(url)).json();
```

If you prefer a REPL, use the starter with pre-loaded data (or continue where you left off?).

## Count frequency of distinct values

In the previous post, we found out there are only 3 species in our dataset. An obvious next question is: how many of the 344 observations does each of the 3 species account for?

Let’s first find out for just one species to understand what kind of operation is needed for this task and afterwards look at all species.

Out of the 3 species at hand, *Gentoo* sounds like the most fun to me, so we’ll count these first.
Basically, we want to take the **array** and summarize the number of appearances of *Gentoo*. Another way of saying the same is: we want to *reduce* the **array** (many observations) to a single value (number of *Gentoo* penguins).

Drumroll…

To introduce `reduce()`, I’ll reach for a standard example: computing the sum of *something*.
Let’s make it really simple.

### Reduce to sum

```js
// example array
let bodyMassGrams = [3750, 3800, 3250];
// declare initial value
let initialValue = 0;
let bodyMassSum = bodyMassGrams.reduce(
  (accumulator, currentValue) => accumulator + currentValue,
  initialValue
);
console.log(bodyMassSum); // logs 10800
```

This code takes the **array** of 3 numbers and adds them up.
It goes through the **array**, takes the *last value* (`accumulator`) and adds the *current value* to it.
Because we specify an *initial value*, this will be the first value that `accumulator` takes (i.e. `initialValue = 0`).
*Current value* will be 3750, because it is the first entry in our **array**, and the sum of both is also 3750 (`0 + 3750`).
Moving on to the next step, *now* `accumulator` is 3750 and *current value* is 3800, which returns 7550, and so on until the last entry.

**Under the hood of reduce()**

Curious to see how *accumulator* and *current value* change while reduce() works through the array?
What do you think happens when you omit an initial value?
Run this code to find out!

```js
let bodyMassGrams = [3750, 3800, 3250];
// this function will do the summation-part and log all values
function logger(accumulator, currentValue, index) {
  const returns = accumulator + currentValue;
  console.log(
`index: ${index},
accumulator: ${accumulator},
currentValue: ${currentValue},
returns: ${returns}
==============================`,
  );
  return returns;
}
// run the reduce-logger to see what's inside
bodyMassGrams.reduce(logger);
```

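Notice how reduce starts at index 1! Without an initial value, the first entry of the **array** becomes the starting `accumulator`, so the first *current value* is the second entry. Try changing the last line to `bodyMassGrams.reduce(logger, 0);` to add an initial value and see what happens.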
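One more reason to pass an initial value: on an empty **array**, `reduce()` has nothing to start from. The sketch below uses a made-up empty array (`noPenguins` and `add` are names I chose for illustration) to show both cases.

```js
// hypothetical empty array, e.g. the result of a filter that matched nothing
const noPenguins = [];
const add = (accumulator, currentValue) => accumulator + currentValue;

// with an initial value, reduce() simply returns that value
const safeSum = noPenguins.reduce(add, 0);
console.log(safeSum); // logs 0

// without an initial value, reduce() throws a TypeError on an empty array
let errorName = null;
try {
  noPenguins.reduce(add);
} catch (error) {
  errorName = error.name;
}
console.log(errorName); // logs "TypeError"
```

So if there is any chance the **array** is empty, an explicit initial value is the safer habit.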

Back to our data (an **array of objects**, remember?). Let’s count all those *Gentoo* penguins with the help of reduce:

### Count Gentoo

```js
let initialValue = 0;
let countGentoo = data.reduce((counter, row) => {
  if (row.species === "Gentoo") counter += 1;
  return counter;
}, initialValue);
console.log(countGentoo); // logs 124
```

Voilà! We have 124 *Gentoo* penguins in our dataset.

I deviated from the traditional variable names here, because basically what we do is **counting**. And since we’re iterating through the rows of our table (aka the **objects** inside our **array**), I use *row* to refer to the current **object**.
For each row, we check whether its `species` property is equal to `"Gentoo"`.
**If so**, we increase our counter (the accumulator) by 1.
We return the counter after every row, so that it is passed on to the next one; the value returned for the last row is the final result.

**Find another way**

If you still remember what you learned about `filter()`, then you will be able to use that too to find out how many Gentoo are in the data.

I’ll leave that as an exercise to you.

But! If we want to do the same for the other two species, we’d have to repeat that code… Instead, let’s quickly turn our snippet from above into a function that we can re-use.

### A counting function

```js
function speciesCounter(data, species) {
  return data.reduce((counter, row) => {
    if (row.species === species) counter += 1;
    return counter;
  }, 0);
}
const countGentooToo = speciesCounter(data, "Gentoo");
console.log(countGentooToo); // logs 124 too
```

Now you can easily do the same for the other 2 species. Have a play with it.
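If you'd rather not call the function three times by hand, here is a minimal sketch that loops over the species names with `map()`. To keep it runnable on its own, it uses a tiny made-up dataset (`sampleRows`) and repeats the counting function under the name `countSpecies`; both names are my own and not part of the original data.

```js
// hypothetical mini dataset standing in for the real `data`
const sampleRows = [
  { species: "Adelie" }, { species: "Gentoo" }, { species: "Adelie" },
  { species: "Chinstrap" }, { species: "Gentoo" }, { species: "Adelie" },
];

// same logic as the counting function above, repeated so the snippet runs alone
function countSpecies(rows, species) {
  return rows.reduce((counter, row) => {
    if (row.species === species) counter += 1;
    return counter;
  }, 0);
}

// call the counter once per species name
const speciesNames = ["Adelie", "Gentoo", "Chinstrap"];
const counts = speciesNames.map(name => [name, countSpecies(sampleRows, name)]);
console.log(counts);
// logs [["Adelie", 3], ["Gentoo", 2], ["Chinstrap", 1]]
```

Swap `sampleRows` for the real `data` to get the counts for all 344 observations.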

**Bonus: Count all at once**

Do you want to be extra-efficient? Thanks to Leigh Halliday’s tutorial video, I give you a more advanced way to count all species at the same time.

```js
let countAll = data.reduce((acc, row) => {
  return { ...acc, [row.species]: (acc[row.species] || 0) + 1 };
}, {});
console.log(countAll); // logs { Adelie: 152, Gentoo: 124, Chinstrap: 68 }
```

Please check out the video, where Leigh does an amazing job at explaining this super concise code. He does his example with people rather than penguins, but it’s the same logic.
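If you want to see what the accumulator object looks like along the way, here is a small trace of the same pattern. It is only a sketch on three made-up rows (`miniRows`), and the `steps` array is a helper I added for logging.

```js
// hypothetical three-row sample, just enough to watch the object grow
const miniRows = [
  { species: "Adelie" }, { species: "Gentoo" }, { species: "Adelie" },
];

const steps = [];
const miniCounts = miniRows.reduce((acc, row) => {
  // acc[row.species] is undefined the first time a species shows up,
  // so `|| 0` makes the count start at 0 before we add 1
  const next = { ...acc, [row.species]: (acc[row.species] || 0) + 1 };
  steps.push(JSON.stringify(next));
  return next;
}, {});

console.log(steps);
// logs ['{"Adelie":1}', '{"Adelie":1,"Gentoo":1}', '{"Adelie":2,"Gentoo":1}']
```

Each step copies the previous object via the spread syntax and then overwrites (or adds) the key for the current species.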

**Why is everybody using 'accumulator'?**

The variable name `accumulator` or the shorter version `acc` is a very common choice for JavaScript’s reduce(). When I first encountered reduce(), I saw lots of examples that always used `(acc, cur)` and I honestly found it not very intuitive.

The main take-away is this: the whole idea of the reduce() function is to go from many values to a single value. The way to get there is to work with some sort of *last value* and *current value* that can be used to compute stuff with. It so happens that most people call the *last value* the `accumulator`, because it **persists** and is **updated** while iterating through all values. In contrast, the *current value*, often called `cur`, is simply set to the value at hand.

In our example above, `accumulator` is literally *accumulating* all values by adding them, one by one.

## Sort the data

Another common thing you might want to do with your data is sorting it.
Say you want to make sure the penguins with the lowest body mass are listed first in your data.
How would you go about that?
While there is a native **array** method called `sort()`, I’m showing this rather late in this post, because I find it to be surprisingly complex.
Here is the code you’d probably write first:

```js
// get ready for a surprise:
let sorted = data.sort((a, b) => a.body_mass_g - b.body_mass_g);
```

If you inspect the `sorted` data, you’ll find it is indeed sorted as expected.
Job done.

There’s a big *BUT*.
Try logging this and be surprised:

```js
// the original is sorted as well!
console.log(sorted === data); // logs "true"
```

If we simply call `data.sort(...)`, we’re *sorting in place*: `sort()` rearranges the original **array** and returns a reference to that very same **array**.
That’s fine if you know what you’re doing and you don’t need to preserve the original order in your data.
**But even if you assign your sorted array to a new variable, the original will be sorted as well!**

**If you actually do want to sort in place...**

… you don’t need to assign to a new variable and instead can just call `sort()` on your data like so:

```js
data.sort((a, b) => a.body_mass_g - b.body_mass_g);
```

Here is the code you need to create *a new* **array** with the sorted values.

```js
// this creates a new sorted copy
let sorted = [...data].sort((a, b) => a.body_mass_g - b.body_mass_g);
```

We use a neat little trick to **first copy** the **array** via the spread syntax `[...data]` and **then sort** *that new copy* in place and assign it to our variable `sorted`.
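To convince yourself of the difference, here is a minimal side-by-side sketch on two small made-up arrays of numbers (the variable names are mine, chosen for illustration).

```js
// in-place sort: the returned array IS the original array
const inPlace = [3750, 3800, 3250];
const sameArray = inPlace.sort((a, b) => a - b);
console.log(sameArray === inPlace); // logs true
console.log(inPlace); // logs [3250, 3750, 3800]

// copy first, then sort: the original keeps its order
const originalMasses = [3750, 3800, 3250];
const sortedCopy = [...originalMasses].sort((a, b) => a - b);
console.log(sortedCopy); // logs [3250, 3750, 3800]
console.log(originalMasses); // logs [3750, 3800, 3250]
```

Keep in mind that the spread syntax makes a *shallow* copy: with an **array of objects**, the new array is separate, but the row objects inside it are still the same ones as in the original.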

**Alternative to the Spread syntax '...'**

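Another way to create a copy and then sort the copied array is to use `slice()` without arguments:

```js
// this creates a new sorted copy
let sorted = [...data].sort((a, b) => a.body_mass_g - b.body_mass_g);
// same result:
let slicedSorted = data.slice().sort((a, b) => a.body_mass_g - b.body_mass_g);
// two separate arrays, so they are not identical...
console.log(sorted === slicedSorted); // logs "false"
// ...but they hold the same rows in the same order
console.log(sorted.every((row, i) => row === slicedSorted[i])); // logs "true"
```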

Pick your poison.
In my humble opinion both are ok, but I would find it more intuitive if `sort()` simply returned a new array (like `map()` and `filter()` do).
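Newer JavaScript engines do offer exactly that: `toSorted()` (part of ES2023) returns a new sorted array and leaves the original alone. Since it may be missing in older browsers or Node versions, the sketch below checks for it first and falls back to copy-then-sort; the variable names are my own.

```js
const masses = [3750, 3800, 3250];
const byMass = (a, b) => a - b;

// use toSorted() where available, otherwise copy and sort the copy
const sortedMasses = typeof masses.toSorted === "function"
  ? masses.toSorted(byMass)
  : [...masses].sort(byMass);

console.log(sortedMasses); // logs [3250, 3750, 3800]
console.log(masses); // logs [3750, 3800, 3250]
```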

### Ascending vs. descending order

The obvious question right now is: why do we need to write this `a - b` stuff?
`sort()` expects a *compare function* that specifies *what* should be sorted and *how* it should be sorted.
It boils down to this:

- If the *compare function* returns a *positive number*, then `a` is sorted *after* `b`.
- If the *compare function* returns a *negative number*, then `a` is sorted *before* `b`.
- If the *compare function* returns `0`, then `a` and `b` keep their relative order.
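Part of the answer to "why `a - b`?" is what happens when you leave the compare function out: `sort()` then converts the values to strings and compares those. A quick sketch with made-up numbers of different lengths shows the effect.

```js
// hypothetical values with a different number of digits
const mixedMasses = [3750, 800, 12000];

// no compare function: values are compared as strings ("12000" < "3750" < "800")
const defaultSorted = [...mixedMasses].sort();
console.log(defaultSorted); // logs [12000, 3750, 800]

// with a - b: values are compared as numbers
const numericSorted = [...mixedMasses].sort((a, b) => a - b);
console.log(numericSorted); // logs [800, 3750, 12000]
```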

Let’s walk through a simplified example to try to understand this:

```js
// simple array
let bodyMassGrams = [3750, 3800, 3250];
// sorted array
bodyMassGrams.sort((a, b) => a - b);
```

What is happening?
Let’s take this **array** of 3 values and think through what is compared and how that will affect the sorting.
In the table below I list the comparison, the numerical result, and the effect on sorting the elements, step by step.

| comparison | result | effect |
|---|---|---|
| (a: 3750) - (b: 3800) | -50 | 3750 is sorted before 3800 |
| (a: 3800) - (b: 3250) | 550 | 3800 is sorted after 3250 |
| (a: 3750) - (b: 3250) | 500 | 3750 is sorted after 3250 |

Now all values have been compared to each other and the order has been established!
The first comparison yields a *negative number*, so `a: 3750` is sorted *before* `b: 3800`.
The other two comparisons both yield a *positive number*, because `3800` and `3750` are both greater than `3250`…

In case this still does not make sense to you right now, I encourage you to watch the video by The Coding Train that I link to below.
Daniel does an incredible job at explaining and showcasing `sort()`.

**Further resources on sort() and '...'**

**Hey, what about descending?**

Yep, you got me there. But it’s simple:

```js
// simple array
let bodyMassGrams = [3750, 3800, 3250];
// sorted array (ascending)
bodyMassGrams.sort((a, b) => a - b);
// sorted array (descending)
bodyMassGrams.sort((a, b) => b - a);
```

Or you use a helper function like this to make it more explicit:

```js
function compareValues(order = "asc") {
  return function innerSort(a, b) {
    let comparison = 0;
    if (a > b) {
      comparison = 1;
    } else if (a < b) {
      comparison = -1;
    }
    return (
      (order === "desc") ? (comparison * -1) : comparison
    );
  };
}
// default: ascending
let asc = [...bodyMassGrams].sort(compareValues("asc"));
// descending on demand
let desc = [...bodyMassGrams].sort(compareValues("desc"));
```
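The helper above compares plain values. For our **array of objects**, one possible variation is to also pass the name of the column to sort by. This is only a sketch: `compareBy` and the three sample rows are hypothetical, and I convert with `Number()` because the columns in the dataset are stored as strings.

```js
// hypothetical variation of the helper: sort objects by one of their keys
function compareBy(key, order = "asc") {
  return function innerSort(a, b) {
    // convert to numbers first, since the dataset stores values as strings
    const left = Number(a[key]);
    const right = Number(b[key]);
    let comparison = 0;
    if (left > right) {
      comparison = 1;
    } else if (left < right) {
      comparison = -1;
    }
    return order === "desc" ? -comparison : comparison;
  };
}

// made-up rows in the same shape as the penguin data
const fewPenguins = [
  { species: "Adelie", body_mass_g: "3750" },
  { species: "Gentoo", body_mass_g: "5000" },
  { species: "Chinstrap", body_mass_g: "3500" },
];

const heaviestFirst = [...fewPenguins].sort(compareBy("body_mass_g", "desc"));
console.log(heaviestFirst.map(row => row.species));
// logs ["Gentoo", "Adelie", "Chinstrap"]
```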

## Mutate, select, rename columns

We’re gonna finish this post off with a roundhouse-kick of data manipulations. Adding a new column based on existing ones? Got ya! Renaming existing columns? Easy! Selecting columns to boil down the amount of data? Totally!

And **all these** can be done with the help of our trusty `map()` function!

### 1: Mutate to add a new column

Let’s jump right in and re-use the `map()` example from above, but this time applying the conversion of grams to kilograms to all 344 observations.

```js
// convert from grams to kilograms for all entries
let dataWithKG = data.map(row => {
  return {
    body_mass_kg: row.body_mass_g / 1000
  };
});
```

As with the other use of `map()` above, this lets us visit each entry in our **array** called `data`.
Inside each of those entries (*or rows of our table*), we define a new pair of `key:` and `value`.
Sticking to the variable-naming convention of this dataset, we call our new column `body_mass_kg` and define the values with the simple computation for each of the `row.body_mass_g`.
Notice that we `return` an **object** by wrapping our simple computation in curly bois `{}`.
If you `console.log()` the result of this operation, you’ll see that we successfully converted all entries in our rows from grams to kilograms.
But you’ll also notice that all the other columns are gone.
That’s a shame, isn’t it?
Luckily, it’s easy to preserve them all with another friend: the **spread syntax**.

```js
// convert from grams to kilograms for all entries
// and preserve existing columns
let dataWithKG = data.map(row => {
  return {
    ...row,
    body_mass_kg: row.body_mass_g / 1000
  };
});
```

And just like that you have all other variables preserved as well.

**I noticed something about the new column!**

If you logged this one out too and looked at the output closely, you probably noticed that all the old variables are shown as `'strings'`, while the new column is rendered as a `number`.
(Depending on your setup/browser etc. the look of it might vary.)

Well, yes! That’s because I’m lazy. *All the data* in the original array of objects are strings.

If you want to clean up my mess, you can use this little trick to convert any of the columns to numerical values too.
You’ll have to specify which columns you want to convert inside `map()` like this:

```js
let dataWithNumerics = data.map(row => {
  return {
    ...row,
    body_mass_kg: row.body_mass_g / 1000,
    // conversions:
    body_mass_g: +row.body_mass_g,
    bill_length_mm: +row.bill_length_mm,
    bill_depth_mm: +row.bill_depth_mm,
    flipper_length_mm: +row.flipper_length_mm,
    year: +row.year
  };
});
```

It looks weird, I know. But this is a very common pattern you’ll see frequently in the wild to convert from strings to numbers.
It uses the unary plus operator, which is *“the fastest and preferred way of converting something into a number, because it does not perform any other operations on the number”*.
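Here is a quick sketch of what the unary plus does with a few different strings. The inputs are made up; in particular, `"NA"` is just my assumption of how a missing value might be encoded.

```js
const asInteger = +"3750";
const asDecimal = +"36.7";
const notANumber = +"NA"; // hypothetical placeholder for a missing value
const fromEmpty = +"";

console.log(asInteger, typeof asInteger); // logs 3750 "number"
console.log(asDecimal); // logs 36.7
console.log(notANumber); // logs NaN
console.log(fromEmpty); // logs 0
```

The last two are worth remembering: text that is not a number becomes `NaN`, and an empty string quietly becomes `0`.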

### 2: Select columns

Going from many columns to a selection of a few needed columns is a piece of cake now.
Let’s say we want to visualize `body_mass_g` by `species`.
Let’s make a new, smaller **array of objects** that only contains those two columns.

```js
let smolData = data.map(row => {
  return {
    species: row.species,
    body_mass_g: +row.body_mass_g
  };
});
```

If we then reduce our values to only retain the highest values per species, we can see who is the largest:

#### Heaviest Penguins by Species

A proper visualization should maybe also show the variation within each species, instead of reducing everything to the heaviest observation… Let me know if you create a better graph…!
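The "highest value per species" reduction mentioned above could look roughly like this. It is a sketch on a made-up sample (`smallSample`) in the same shape as `smolData`, and it is not necessarily the code behind the original chart.

```js
// hypothetical sample in the same shape as smolData
const smallSample = [
  { species: "Adelie", body_mass_g: 3750 },
  { species: "Gentoo", body_mass_g: 5000 },
  { species: "Adelie", body_mass_g: 4100 },
  { species: "Gentoo", body_mass_g: 4600 },
];

const heaviest = smallSample.reduce((acc, row) => {
  // skip rows where the conversion to a number failed
  if (Number.isNaN(row.body_mass_g)) return acc;
  const currentMax = acc[row.species];
  // keep the value if it is the first or the largest seen for this species
  if (currentMax === undefined || row.body_mass_g > currentMax) {
    acc[row.species] = row.body_mass_g;
  }
  return acc;
}, {});

console.log(heaviest); // logs { Adelie: 4100, Gentoo: 5000 }
```

Unlike the counting example earlier, this version updates one accumulator object instead of copying it with the spread syntax on every row; both approaches work.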

### 3: Rename columns

I think you can guess by now how to rename a column.
Nobody is stopping you from just using a different `key:` inside your `map()` call.
So we can just replace `species` and `body_mass_g` with whatever floats your boat.

```js
let smolData = data.map(row => {
  return {
    type: row.species,
    mass: +row.body_mass_g
  };
});
```

Keep in mind that special characters should be avoided here.
If you stick to *just letters*, you’re fine.
Technically, you could also go crazy and use something like `"I need some space":` as your `key:`, but you’ll have other problems down the road then, like not being able to use the **dot-notation** to access your rows (`row.I need some space` will not work).
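If you ever do end up with such a key, the bracket notation still gets you there. A minimal sketch with a made-up row:

```js
// hypothetical row with one awkward key and one plain key
const spacedRow = { "I need some space": "Gentoo", mass: 5000 };

// bracket notation works with any string as key
const viaBrackets = spacedRow["I need some space"];
console.log(viaBrackets); // logs "Gentoo"

// it also accepts a variable that holds the key
const columnName = "I need some space";
console.log(spacedRow[columnName]); // logs "Gentoo"

// plain keys work with dot notation as usual
console.log(spacedRow.mass); // logs 5000
```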

## Closing

You made it. Thanks for reading! I hope you learned something new and maybe even had some fun.

Let me know :) If you want to talk, find me on Mastodon or use my form over here.