
Stacked Charts Part 1: Understanding Your Data

Working with Chart.js, WordPress, and taxonomies begins with understanding the data you’re processing and saving it in a retrievable way.

There are a few different types of charts. Actually, there are a lot. I find a nice bar chart fairly easy to read and understand. So when Tracy said we should generate some nice stats about nations, like how many shows there were per nation, I was able to do that pretty easily:

An excerpt of shows by nation - USA has the most. Yaaaay.

And as far as that goes, it’s pretty cool. It’s really just the same code I use to generate category statistics already. This is, by the way, why using WordPress to generate your data is useful. It’s easy to replicate code you’ve already got.

But then Tracy, who I think derives some perverse joy out of doing this to me, says “Can we find out how many trans characters there are per nation?”

Use WordPress First

If you heard my talks about Sara Lance, you’ve heard me tout that data-based sites should always use WordPress functions first. By which I mean they should use taxonomies and custom post types when possible, because accessing the data will be consistent, regular, and repeatable.

Ironically, it’s because I chose to use WordPress that I was in a bit of a bind.

You see, we have three post types on the site right now: shows, characters, and actors. The shows have the taxonomy of ‘nation’ so getting that simple data was straightforward. The characters store the taxonomies of gender identity and sexual preference. That sounds pretty logical, right?

So how, you may wonder, do we get a list of characters on a show? A query. Basically we search wp_postmeta for all characters that have the lezchars_show_group array and, within that multidimensional array, have a show entry saved with the post ID of the show. Which means the characters are dynamically generated every single time a page is loaded. And yes, that is why I use The L Word as my benchmark for page speed.

However, because all of this is done dynamically, generating the stats for characters per nation would look like this:

  1. Use get_terms to get a list of all shows in a nation to …
  2. Loop through all those shows and …
  3. Loop through all the characters on each show to extract the data to …
  4. Store the data per nation

Ouch. Talk about slow.
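
To make the cost concrete, here is a minimal standalone sketch of those four steps using plain PHP arrays. The sample data, the show IDs, and the count_gender_per_nation() helper are all hypothetical stand-ins; on the real site each lookup would be a database query rather than an array read, which is where the slowness comes from.

```php
<?php
// Hypothetical sample data standing in for the real taxonomy and meta queries.
$shows_by_nation = array(
	'usa'    => array( 101, 102 ),
	'canada' => array( 201 ),
);
$characters_by_show = array(
	101 => array( 'cisgender', 'cisgender', 'transgender' ),
	102 => array( 'cisgender' ),
	201 => array( 'transgender', 'cisgender' ),
);

// Hypothetical helper: tally character genders per nation with three nested loops.
function count_gender_per_nation( array $shows_by_nation, array $characters_by_show ) {
	$stats = array();
	// Steps 1 and 2: every nation, then every show in that nation.
	foreach ( $shows_by_nation as $nation => $show_ids ) {
		$stats[ $nation ] = array();
		foreach ( $show_ids as $show_id ) {
			$characters = isset( $characters_by_show[ $show_id ] ) ? $characters_by_show[ $show_id ] : array();
			// Step 3: every character on that show.
			foreach ( $characters as $gender ) {
				if ( ! isset( $stats[ $nation ][ $gender ] ) ) {
					$stats[ $nation ][ $gender ] = 0;
				}
				// Step 4: store the running count per nation.
				$stats[ $nation ][ $gender ]++;
			}
		}
	}
	return $stats;
}

$stats = count_gender_per_nation( $shows_by_nation, $characters_by_show );
print_r( $stats );
```

Three levels of nesting are harmless on six fake characters, but they multiply quickly once every level is a query against a large site.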


Solution? Use WordPress!

Thankfully there was a workaround. One of the other odd things we do with shows is generate a show ‘score’: a value calculated from the show’s relative awesomeness, our subjective enjoyment of it, and the number of characters, alive or dead, it has.

In order to make that generation run faster, every time a show or character is saved, I trigger the following post_meta values to be saved:

  • lezshows_characters – An array of character counts alive and dead
  • lezshows_the_score – The insane math of the score

So I added three more:

  • lezshows_sexuality
  • lezshows_gender
  • lezshows_romantic

All of those are generated when the post is saved, as it loops through all the characters and extracts data.


Generate The Base

In order to get the basics, we start by generating an array of everything we’re going to care about. I do this by listing all the taxonomies I want to use and then looping through them, adding each slug to a new array with a value of 0:

$valid_taxes = array(
	'gender'    => 'lez_gender',
	'sexuality' => 'lez_sexuality',
	'romantic'  => 'lez_romantic',
);
$tax_data = array();

foreach ( $valid_taxes as $title => $taxonomy ) {
	$terms = get_terms( $taxonomy );
	if ( ! empty( $terms ) && ! is_wp_error( $terms ) ) {
		$tax_data[ $title ] = array();
		foreach ( $terms as $term ) {
			// Start every term at zero so it can be incremented later.
			$tax_data[ $title ][ $term->slug ] = 0;
		}
	}
}
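
To picture the result, here is a minimal standalone sketch of the same zero-filling step without WordPress. The term slugs are hypothetical examples; on the site they would come from get_terms(). It uses PHP’s array_fill_keys() as a shortcut for the inner loop.

```php
<?php
// Hypothetical term slugs per taxonomy; the real ones come from get_terms().
$terms_by_title = array(
	'gender'    => array( 'cisgender', 'transgender', 'non-binary' ),
	'sexuality' => array( 'lesbian', 'bisexual' ),
	'romantic'  => array( 'homoromantic', 'aromantic' ),
);

$tax_data = array();
foreach ( $terms_by_title as $title => $slugs ) {
	// array_fill_keys() builds array( slug => 0, ... ) in one call.
	$tax_data[ $title ] = array_fill_keys( $slugs, 0 );
}

// Each top-level key now holds one zeroed counter per term slug.
print_r( $tax_data );
```

Starting every counter at zero means the later increment never has to check whether a key exists first.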

That gives me a multidimensional array which, I admit, is pretty epic and huge. But it lets me move on to step two, getting all the characters:

$count          = wp_count_posts( 'post_type_characters' )->publish;
$charactersloop = new WP_Query( array(
	'post_type'              => 'post_type_characters',
	'post_status'            => array( 'publish' ),
	'orderby'                => 'title',
	'order'                  => 'ASC',
	'posts_per_page'         => $count,
	'no_found_rows'          => true,
	'meta_query'             => array(
		array(
			'key'     => 'lezchars_show_group',
			'value'   => $post_id,
			'compare' => 'LIKE',
		),
	),
) );

Next I store everything as a new array. Which is where we get into some serious fun. See, I have to actually double check the character is in the show, since the ‘like’ search has a few quirks when you’re searching arrays. The tl;dr explanation here is that if I look for shows with a post ID of “23” then I get “23” and “123” and “223” and so on.

Yeah. It’s about as fun as you’d think. If I wasn’t doing arrays, this would be easier, but I have Sara Lance to worry about.
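
Here is a minimal standalone sketch of why that happens. WordPress serializes array meta values before storing them, and a LIKE comparison is a substring match against that stored string. The sample character data and the character_is_on_show() helper are hypothetical, and strpos() is only standing in for the database’s LIKE.

```php
<?php
// Hypothetical character meta: this character is on show 123, not show 23.
$shows_array = array(
	array( 'show' => '123' ),
);

// Array meta is stored as a serialized string.
$stored = serialize( $shows_array );

// A LIKE search for 23 behaves roughly like this substring check.
$like_matches = ( false !== strpos( $stored, '23' ) );

// Hypothetical helper: confirm the match by comparing the real IDs.
function character_is_on_show( array $shows_array, $post_id ) {
	foreach ( $shows_array as $char_show ) {
		if ( isset( $char_show['show'] ) && (int) $char_show['show'] === (int) $post_id ) {
			return true;
		}
	}
	return false;
}

$really_on_show = character_is_on_show( $shows_array, 23 );

var_dump( $like_matches );   // bool(true): the substring search is fooled
var_dump( $really_on_show ); // bool(false): the exact comparison is not
```

The query is still useful as a cheap first pass that narrows the candidates; the exact comparison afterwards is what makes the count trustworthy.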

if ( $charactersloop->have_posts() ) {
	while ( $charactersloop->have_posts() ) {
		// Advance the loop; without this, get_the_ID() never changes.
		$charactersloop->the_post();
		$char_id     = get_the_ID();
		$shows_array = get_post_meta( $char_id, 'lezchars_show_group', true );

		if ( $shows_array !== '' && get_post_status( $char_id ) == 'publish' ) {
			foreach ( $shows_array as $char_show ) {
				// Exact comparison weeds out the LIKE false positives.
				if ( $char_show['show'] == $post_id ) {
					foreach ( $valid_taxes as $title => $taxonomy ) {
						$this_term = get_the_terms( $char_id, $taxonomy );
						if ( $this_term && ! is_wp_error( $this_term ) ) {
							foreach ( $this_term as $term ) {
								$tax_data[ $title ][ $term->slug ]++;
							}
						}
					}
				}
			}
		}
	}
	wp_reset_postdata();
}

You’ll notice there’s a quick $tax_data[ $title ][ $term->slug ]++; in there to increment the count. That’s the magic that gets processed all over. It tells me things like “this show has 7 cisgender characters” which is the first half of everything I wanted.

Because in the end I save this as an array for the show:

foreach ( $valid_taxes as $title => $taxonomy ) {
	update_post_meta( $post_id, 'lezshows_char_' . $title, $tax_data[ $title ] );
}


How Well Does This Run?

It’s okay. It’s not super awesome: since it has to loop so many times, it can get pretty chunky. See The L Word and its 60+ characters. However, it only updates when the show is saved, or a character is added to the show, which means the expensive process is limited. And by saving this data in an easily retrievable format, I’m able to do the next phase: generate the stats.