Half-Elf on Tech

Thoughts From a Professional Lesbian

Author: Ipstenu (Mika Epstein)

  • Trust No One

    A constant refrain in my security reviews of plugins and themes is to sanitize everything. Sometimes my pedantic insistence on sanitizing everything leads people to ask me why I don’t trust users.

    The short answer is “Because I am one.”

    The longer answer is that I believe in three types of users, and I’m aware of their character flaws.

    Users

    Most people on the internet fall into this category. They know how to log in and check email and read a blog post. Maybe they leave comments, maybe not. These people are the most basic of them all, but that isn’t a bad thing at all. Users do what they want and don’t often think about the consequences, because for them there really are none except saying things they didn’t mean or wanting a comment deleted.

    These users are dangerous because they don’t think about what they’re doing. They trust, blindly sometimes, that the websites they visit won’t hurt them. That means that data they input has to be sanitized and validated because they may not realize what they’re doing. They may put a URL in the email field of your comment form, and they should be warned about those things.

    You can’t trust the users because they don’t know any better.
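
    As a minimal sketch of the email-field case above, plain PHP can both clean up and validate the input before anything is stored. The function name check_comment_email() is made up for illustration; inside WordPress, is_email() and sanitize_email() cover similar ground.

```php
<?php
/**
 * Hypothetical helper: clean and validate a submitted email field.
 * Returns the cleaned address, or null when the input is not an email.
 */
function check_comment_email( $raw ) {
    $clean = trim( (string) $raw );

    // FILTER_VALIDATE_EMAIL returns the address on success and false on failure.
    $valid = filter_var( $clean, FILTER_VALIDATE_EMAIL );

    return ( false === $valid ) ? null : $valid;
}

// A URL typed into the email field gets a warning instead of being saved.
$input = 'https://example.com';
if ( null === check_comment_email( $input ) ) {
    echo "That doesn't look like an email address.\n";
}
```

    The point is less the specific filter and more the habit: the user gets told what went wrong, and the bad value never reaches the database.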

    Experienced Users

    This is actually not the most dangerous category. You might think they would be, because they know enough to be dangerous. Instead, I’ve found these users know enough to be cautious. They know what they don’t know, and they’re careful about what they’re doing. They check six or eight times before they paste in data, and they read error messages. Oh yes, these people. You know them, don’t you? They send screenshots of errors and test out theories before telling you “This is broken.”

    We like those people, though you may be wondering what about the experienced users who don’t do the legwork. To me, they’re users. There’s nothing wrong with being a user, but it changes my expectations on what they do and who they are. If someone is experienced, though, they’re going to play with things and that means they might break things when they try to recreate the problems.

    You can’t trust the experienced users because they mean well.

    Admin Users

    These are the users who terrify me the most, and sadly, this is where most WordPress users actually are. Because if you’ve installed your own version of WordPress, you are an admin user. God save your soul. And here’s why they scare me: they have more power than the experienced user but the skill of a user. They’re kind of like toddlers.

    This is not meant as an insult. The problem is that, unchecked, they can destroy their own sites. They copy and paste code or content into the post editor. In fact, that’s the biggest problem. Many years ago, my friend John and I spent five days debugging a crash, all because we didn’t know that no one who knew what they were doing would ever enter that data format into a field, and since we were admins, the check was overridden.

    You can’t trust the admin users because they have phenomenal cosmic powers.

    Trust No One

    Not to sound all Fox Mulder on you, but trust no one’s data. Especially not your own. Don’t assume you know what you’re doing, that you never typo, that you’re always right. You’re not. No one is. And we don’t trust data because we could be wrong. It’s just that simple.

  • Linear Regressions in PHP

    Sometimes math exists to give me a headache.

    In calculating the deaths of queer females per year, my wife wondered what the trend was, other than “Holy sweat socks, it’s going up!” That’s called a ‘trendline’ which is really just a linear regression. I knew I needed a simple linear regression model and I knew what the formula was. Multiply the slope by the X axis value, add the intercept (which is often a negative number), and you will calculate the points needed.

    Using Google Docs to generate a trend line is easy. Enter the data and tell it to make a trend line. Using PHP to do this is a bit messier. I use Chart.js to generate my stats into pretty graphs, and while it gives me a lot of flexibility, it does not make the math easy.

    I have an array of data for the years and the number of deaths per year. That’s the easy stuff. As of version 2.0 of Chart.js, you can stack charts, which lets me run two lines on top of each other like this:

    var myChart = new Chart(ctx, {
        type: 'bar',
        data: {
            labels: ['Item 1', 'Item 2', 'Item 3'],
            datasets: [
                {
                    type: 'line',
                    label: 'Line Number One',
                    data: [10, 20, 30],
                },
                {
                    type: 'line',
                    label: 'Line Number Two',
                    data: [30, 20, 10],
                }
            ]
        }
    });
    

    But. Having the data doesn’t mean I know how to properly generate the trend. What I needed was the most basic formula solved: y = x(slope) + intercept and little more. Generating the slope and intercept is the annoying part.

    For example, the slope (b) is (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²) and the intercept (a) is (ΣY - b(ΣX)) / N, where:

    • x and y are the variables.
    • b = The slope of the regression line
    • a = The intercept point of the regression line and the y axis.
    • N = Number of values or elements
    • X = First Score
    • Y = Second Score
    • ΣXY = Sum of the product of First and Second Scores
    • ΣX = Sum of First Scores
    • ΣY = Sum of Second Scores
    • ΣX² = Sum of the squared First Scores

    If that made your head hurt, here’s the PHP to calculate it (thanks to Richard Thome):

    	function linear_regression( $x, $y ) {
    
    		$n     = count($x);     // number of items in the array
    		$x_sum = array_sum($x); // sum of all X values
    		$y_sum = array_sum($y); // sum of all Y values
    
    		$xx_sum = 0;
    		$xy_sum = 0;
    
    		for($i = 0; $i < $n; $i++) {
    			$xy_sum += ( $x[$i]*$y[$i] );
    			$xx_sum += ( $x[$i]*$x[$i] );
    		}
    
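    		// Slope (the denominator is zero for an empty array or when every X value is identical)
    		$denominator = ( $n * $xx_sum ) - ( $x_sum * $x_sum );
    		if ( 0 == $denominator ) {
    			// No line can be fitted; fall back to a flat line at the average Y.
    			return array( 'slope' => 0, 'intercept' => ( $n ? $y_sum / $n : 0 ) );
    		}
    		$slope = ( ( $n * $xy_sum ) - ( $x_sum * $y_sum ) ) / $denominator;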
    
    		// calculate intercept
    		$intercept = ( $y_sum - ( $slope * $x_sum ) ) / $n;
    
    		return array( 
    			'slope'     => $slope,
    			'intercept' => $intercept,
    		);
    	}
    

    That spits out an array with two numbers, which I can plunk into my much simpler equation and, in this case, echo out the data point for each item:

    foreach ( $array as $item ) {
         $number = ( $trendarray['slope'] * $item['name'] ) + $trendarray['intercept'];
         $number = ( $number <= 0 )? 0 : $number;
         echo '"'.$number.'", ';
    }
    

    And yes. This works.
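
    To see the numbers move, here is a small self-contained sketch with made-up sample data (not the real statistics). The helper name trendline_points() is hypothetical; it applies the same slope and intercept formulas, then the same y = x(slope) + intercept step, flooring negative values at zero. Rounding to two decimals is my own addition for readability.

```php
<?php
/**
 * Hypothetical helper: least-squares trend points for paired X and Y values.
 * Returns one trend value per X, or an empty array when no line can be fitted.
 */
function trendline_points( array $x, array $y ) {
    $n      = count( $x );
    $x_sum  = array_sum( $x );
    $y_sum  = array_sum( $y );
    $xy_sum = 0;
    $xx_sum = 0;

    for ( $i = 0; $i < $n; $i++ ) {
        $xy_sum += $x[ $i ] * $y[ $i ];
        $xx_sum += $x[ $i ] * $x[ $i ];
    }

    // Zero for an empty array or when every X value is identical.
    $denominator = ( $n * $xx_sum ) - ( $x_sum * $x_sum );
    if ( 0 == $denominator ) {
        return array();
    }

    $slope     = ( ( $n * $xy_sum ) - ( $x_sum * $y_sum ) ) / $denominator;
    $intercept = ( $y_sum - ( $slope * $x_sum ) ) / $n;

    $points = array();
    foreach ( $x as $value ) {
        // y = x(slope) + intercept, never below zero.
        $points[] = max( 0, round( ( $slope * $value ) + $intercept, 2 ) );
    }

    return $points;
}

// Made-up sample: five periods and a count for each.
$periods = array( 1, 2, 3, 4, 5 );
$counts  = array( 2, 4, 5, 4, 5 );

// Prints a comma-separated list ready to drop into a Chart.js data array.
echo implode( ', ', trendline_points( $periods, $counts ) ), "\n";
```

    For this sample the slope works out to 0.6 and the intercept to 2.2, so the trend climbs by 0.6 per period.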

    [Image: Trendlines and Death]

  • (Slightly) More Performant WP Queries

    One of the things 10up lists as a best engineering practice is this:

    Do not use posts_per_page => -1.
    This is a performance hazard. What if we have 100,000 posts? This could crash the site. If you are writing a widget, for example, and just want to grab all of a custom post type, determine a reasonable upper limit for your situation.

    This is a very valid point, but I found myself stymied at how to work around it in a case where I knew I needed to check all posts in a custom type. Worse, that post type was growing by 10 to 20 posts every week. In my case, the reasonable upper limit was both unknown and unpredictable. But an endless loop would also be bad.

    One of the other recommendations from 10up is not to run more queries than needed. I was already using no_found_rows => true to prevent counting the total rows, as it’s really only necessary for pagination. I also force in the post type I’m scanning, which again limits how many possible items will be queried. And yes, I have update_post_meta_cache and update_post_term_cache set to false as in most cases those aren’t needed either.

    But what could I do to improve the actual query, the one that works out how many posts of type A have a post meta value that matches a post of type B? And to make it worse, I could have multiple values in the post metas. It’s really a case where, in retrospect, making them into a custom taxonomy might have been a bit wiser.

    What I decided to do was limit the number of posts queried based on how many posts were in the post type.

    'posts_per_page' => wp_count_posts( 'custom_post_type' )->publish,

    I’m not quite concerned with 100,000 posts, and I have some database caching installed to mitigate the load. But also I have set an upper limit and this feels less insane than the -1 value. Since I had to generate the count anyway for displaying statistics, I moved that check to a variable and called it twice.
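
    Pulled together, the query arguments described above might look something like this sketch. The helper name bounded_query_args() and the $hard_cap ceiling are my own assumptions for illustration, not part of the original setup; the array keys are standard WP_Query arguments.

```php
<?php
/**
 * Hypothetical helper: build WP_Query arguments with a bounded posts_per_page.
 *
 * $published_count would come from wp_count_posts( $post_type )->publish,
 * fetched once and reused for the statistics display.
 * $hard_cap is an assumed safety ceiling in case the post type keeps growing.
 */
function bounded_query_args( $post_type, $published_count, $hard_cap = 5000 ) {
    return array(
        'post_type'              => $post_type,
        'post_status'            => 'publish',
        // At least 1, at most the cap, otherwise the real published count.
        'posts_per_page'         => max( 1, min( (int) $published_count, (int) $hard_cap ) ),
        'no_found_rows'          => true,  // skip counting total rows (only needed for pagination)
        'update_post_meta_cache' => false, // skip priming the post meta cache
        'update_post_term_cache' => false, // skip priming the term cache
    );
}

// Inside WordPress this might be used as:
// $count = wp_count_posts( 'custom_post_type' )->publish;
// $query = new WP_Query( bounded_query_args( 'custom_post_type', $count ) );
$args = bounded_query_args( 'custom_post_type', 120 );
echo $args['posts_per_page'], "\n";
```

    The cap is a belt-and-braces choice: the count already bounds the query, and the ceiling just keeps that bound from quietly becoming enormous.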

  • Access: Denied or Granted?

    One of the topics I discuss regularly with developers is that of access. When writing any code, we must always consider who should have access to it. Not in the ‘who can look at my code?’ aspect, but that of who can run the code.

    This happens a lot with plugins and themes when people create their own settings pages. While I’m the first to admit that the settings API in WordPress is a bag of wet hair that makes little logical sense, using it gives you access to a lot of built-in security features, such as nonces.

    But as I often tell people, a nonce is not bulletproof security. We cannot rely on nonces for authorization purposes.

    What is a Nonce?

    A nonce is a word or expression coined for or used on one occasion. But in security, a nonce is an arbitrary number or phrase that may only be used once. Due to its similarity to a nonce word, it reuses the name. In WordPress, it’s a short hash, tied to a user, an action, and a limited time window, that is added to a form or link (say, next to a save button) to ensure that someone had to press the button in order to run the function.

    For example, if your settings page has a nonce, then the nonce is sent along with the form and passed to the function(s) that validate and sanitize and save. If the nonce doesn’t check out, then WordPress knows someone didn’t press a button, and it won’t run the save. This prevents someone from just sending raw data at your code.

    Why isn’t that enough?

    With just a nonce, anyone who has access to the page can save data. This is okay when you think about something like a comment form or a contact form. Those are things that need to be accessible by non logged in users, right? What about the WordPress General settings page? The one that lets you change the URL of the website? Right. You only want that to be accessible to admins. Imagine if all logged in users could change that one. Yikes!

    In order to protect those pages, you have to consider who should have access to your code.

    1. Are the users logged in or out?
    2. What user permissions should be required?
    3. How much damage can be caused if the data is changed?

    That’s it. That’s all you have to ask. If it’s a contact form, then a nonce is all you need. If it’s a site definition change, then you may want to restrict it to admins or editors only.
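
    For the admin-only case, a save handler might check both things, roughly like this sketch. The function name, option name, nonce action, and field names are all made up for illustration; current_user_can(), wp_verify_nonce(), wp_unslash(), sanitize_text_field(), and update_option() are WordPress functions, so the handler only does anything useful inside WordPress.

```php
<?php
/**
 * Hypothetical save handler for a settings form.
 * Returns a short status string so the caller can decide what to show.
 */
function my_plugin_save_settings( array $submitted ) {
    // Authorization: who is asking? Only users who can manage options may save.
    if ( ! current_user_can( 'manage_options' ) ) {
        return 'denied';
    }

    // Intent: did the request come from our form? The nonce answers only this.
    $nonce = isset( $submitted['my_plugin_nonce'] ) ? $submitted['my_plugin_nonce'] : '';
    if ( ! wp_verify_nonce( $nonce, 'my_plugin_save' ) ) {
        return 'invalid-nonce';
    }

    // Data: sanitize before saving, even though the user is an admin.
    $title = isset( $submitted['my_plugin_title'] )
        ? sanitize_text_field( wp_unslash( $submitted['my_plugin_title'] ) )
        : '';
    update_option( 'my_plugin_title', $title );

    return 'saved';
}
```

    The order is a design choice: checking the capability first means a logged-in user without permission is turned away no matter what they send.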

    The Settings API

    When you use the settings API, some of this is made pretty straightforward for you:

    add_action( 'admin_menu', 'my_plugin_admin_menu' );
    function my_plugin_admin_menu() {
        add_options_page( 'My Plugin', 'My Plugin', 'manage_options', 'my-plugin', 'my_options_page' );
    }
    

    In that example, the third value for the options page is manage_options which means anyone who can manage site options (i.e. administrators) can access the settings page.

    The problem you run into is if you want a page to do double duty. What if you want everyone to see a page, but only admins can access the settings part? That’s when you need to use current_user_can() to wrap around your code. Just check if a user can do a thing, and then let them in. Or not.
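
    A double-duty page could be sketched like this. The callback name, nonce action, and field names are hypothetical, and it assumes the page itself was registered with a lower capability (such as read) so that non-admins can open it at all.

```php
<?php
/**
 * Hypothetical page callback: everyone who can open the page sees the report,
 * but only users with manage_options get the settings form.
 */
function my_plugin_render_page() {
    $html = '<h1>My Plugin</h1><p>This report is visible to everyone who can open the page.</p>';

    // Wrap the settings part in a capability check.
    if ( current_user_can( 'manage_options' ) ) {
        $html .= '<form method="post">';
        // Fourth argument false: return the hidden nonce fields instead of echoing them.
        $html .= wp_nonce_field( 'my_plugin_save', 'my_plugin_nonce', true, false );
        $html .= '<input type="text" name="my_plugin_title" />';
        $html .= '<input type="submit" value="Save" />';
        $html .= '</form>';
    }

    return $html;
}
```

    Hiding the form is only half of it; whatever handles the submission still has to repeat the same capability check, since a hidden form is no barrier to a hand-built request.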

    What Permissions are Best?

    In general, you should give admins access to everything, and after that, give people as little access as possible. While it may be annoying that an editor can’t, say, flush the cache, do they really need to? You have to measure the computational expense of what’s happening before you give access to anyone. It’s not just “Who should do a thing?” but “What are the consequences of the thing?”

    Look at the cache again. Flushing a cache, emptying it, is easy. But it takes time and server energy to rebuild. By deleting it, you force your server to rebuild everything, and that will slow your site down. An admin, who has access to delete plugins and themes, should be aware of that. An editor, who edits posts and menus, may not. And while some editors might, will all?

    This gets harder when you write your code for public use. You have to make hard decisions to protect the majority of users.

    Be as prudent as possible. Restrict as much as possible. It’s safer.

  • You Are Not Psychic

    The other day I heard someone mention that they were securing software that they didn’t even have the words for 5 years ago. That reminded me of how fast technology moves. Things that didn’t exist last year are vulnerable today, and the discovery of those things only happens faster as we invent weirder and weirder ways of reinventing the wheel.

    The Myth of Perfection

    A well known saying is that the perfect is the enemy of the good. We take that to mean that if we wait until a thing is perfect, we will never feel it is done enough and ready enough for the world. In Open Source technology we’re fond of releasing and iterating, accepting the lack of perfection and aiming instead for the minimum. What is ‘good enough’ to let people use our work and improve on it?

    Perfection doesn’t exist. There is nothing on the planet that is perfect and there never will be. But accepting this doesn’t mean we allow imperfection and danger into our lives.

    The Balance of Reality

    Everyone screws up code, no matter how awesome a professional you are. Accept it 🙂

    I said that nearly three years ago, and I’ve argued before that it’s okay to screw up. I really do believe that making mistakes isn’t a bad thing. But at the same time, many people understood that to mean I was alright with shipping code that I knew was bad. This is not at all the case.

    If you know your code is bad or insecure, you fix it. Period. You don’t let things you know are bad out the door. But you do release things that work and perhaps lack all the features you want. There’s a difference between releasing bad code and releasing imperfect code.

    The Attack of Security

    To turn this on its ear a little, if someone comes up to you and says “This code is wrong, you should do this other thing.” then you have a choice. You can listen and believe them and study if they’re right and test it and fix it, or you can ignore it.

    When someone you respect tells you those things, you’re inclined to believe them. But at the same time, your heart takes a hit because “this code is wrong” sounds a lot like “this code is bad.” And that sounds like “you wrote bad code” and that feels like “you’re a bad coder.”

    That slope slipped right down, didn’t it? It’s a reality. It’s in our nature to take admonishments of our work as a personal attack, even if they are rarely meant that way. I can count on my hands the number of times I’ve actually told someone they were a bad coder. I can count on one hand the number of times I’ve told someone who wasn’t me that they’re a bad coder. I’ve certainly said it to myself a lot. I believe I’ve told one person directly that I thought they were a bad coder.

    I don’t think they actually understood what I meant… Which says something.

    The Shield of Arrogance

    We are not psychic. If you are, please let me know who to bet on for the World Series. Being humans, and therefore fallible, we cannot be so arrogant as to presume we know all the possible ways our code might be vulnerable. Technology moves so fast that what looks safe today may turn out to be terribly dangerous tomorrow.

    Knowing this, knowing we are imperfect, we know that our fellow humans are also imperfect. The greatest danger to our security is ourselves. And that means, as developers writing code to be used by others, it’s incumbent upon us to protect our fellow humans from innocent mistakes.

    You are not psychic

    You don’t know what will be insecure next. You can’t. So secure your code as best you can and secure it better if people point out your shortcomings. Learn. Improve. Protect.

  • Datepicker and a Widget

    Last week, I worked on making a plugin that would safely and smartly allow for a date selection, and not permit people to put in junk data. Once I had my code ‘clean’ and only accepting good data, there was one question remaining. How do I make it look GOOD?

    Let’s be honest, pretty data matters. If things look good, people use them. It’s that simple. This let me play with another aspect of coding that I don’t generally look at. JavaScript.

    What Code Do I Need?

    There are a lot of ways to tackle this kind of problem. If you wanted to just list the months and days, and have those be drop-downs, you could do that. The option I went with was to have a calendar date-picker pop up, as I felt that would visually explain what the data should be.

    To do that I needed a date picker jQuery script (which is included in WordPress core) and a second script to format the output.

    My Script

    This part is really small:

    jQuery(function() {
        jQuery( ".datepicker" ).datepicker({
            dateFormat : "mm-dd"
        });
    });
    

    All it does is force the format to be “mm-dd” – so if you picked the date, that’s what it would be.
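
    The picker only shapes what a well-behaved browser sends, so the server still has to check the value. A minimal sketch of such a check, assuming the field should hold “mm-dd” or nothing, could look like this; validate_month_day() is a hypothetical name and not the plugin’s actual code.

```php
<?php
/**
 * Hypothetical server-side check for a "mm-dd" value.
 * Returns the value when it is a real calendar day, or an empty string otherwise.
 */
function validate_month_day( $raw ) {
    $raw = trim( (string) $raw );

    // Exactly two digits, a dash, two digits.
    if ( ! preg_match( '/^(\d{2})-(\d{2})$/', $raw, $parts ) ) {
        return '';
    }

    // 2020 is a leap year, so 02-29 is accepted as a valid day.
    return checkdate( (int) $parts[1], (int) $parts[2], 2020 ) ? $raw : '';
}

echo validate_month_day( '03-14' ), "\n"; // a real date passes through unchanged
echo validate_month_day( '13-45' ), "\n"; // junk comes back as an empty string
```

    Returning an empty string for junk lines up with treating a blank field as “use the current day.”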

    Enqueuing the Scripts

    In order to make sure the scripts are only loaded on the widgets page, my enqueue function looks like this:

    	public function admin_enqueue_scripts($hook) {
    		if( $hook !== 'widgets.php' ) return;
    		wp_enqueue_script( 'byq-onthisday', plugins_url( 'js/otd-datepicker.js', __FILE__ ), array( 'jquery-ui-datepicker' ), $this->version, true );
    		wp_enqueue_style( 'jquery-ui', plugins_url( 'css/jquery-ui.css', __FILE__ ), array(), $this->version );
    	}
    

    The CSS is because, by default, WordPress doesn’t include the jQuery UI CSS.

    Calling the Scripts

    In the widget class, I have a function for the form output. In there, I have an input field with a class defined as datepicker, which is used by the jQuery I wrote above, to know “I’m the one for you!”

    	function form( $instance ) {
    		$instance = wp_parse_args( (array) $instance, $this->defaults );
    		?>
    		<p>
    			<label for="<?php echo esc_attr( $this->get_field_id( 'title' ) ); ?>"><?php _e( 'Title', 'bury-your-queers' ); ?>: </label>
    			<input type="text" id="<?php echo esc_attr( $this->get_field_id( 'title' ) ); ?>" name="<?php echo esc_attr( $this->get_field_name( 'title' ) ); ?>" value="<?php echo esc_attr( $instance['title'] ); ?>" class="widefat" />
    		</p>
    
    		<p>
    			<label for="<?php echo esc_attr( $this->get_field_id( 'date' ) ); ?>"><?php _e( 'Date (Optional)', 'bury-your-queers' ); ?>: </label>
    			<input type="text" id="<?php echo esc_attr( $this->get_field_id( 'date' ) ); ?>" name="<?php echo esc_attr( $this->get_field_name( 'date' ) ); ?>" class="widefat datepicker" value="<?php echo esc_attr( $instance['date'] ); ?>" />
    			<br><em><?php _e( 'If blank, the date will be the current day.', 'bury-your-queers' ); ?></em>
    		</p>
    		<?php
    	}
    

    Making it Pretty

    To be honest, once I got the JS working, I left the default CSS alone. Why? Because I’m a monkey with a crayon when it comes to design. The default worked fine for me:

    [Image: The Default Date Picker]

    It does make me think that it would be nice if WordPress included its own customized datepicker colors in the admin colors, but I understand why they don’t. Not everyone, or even most people, will ever need this.