Archive for the ‘Programming’ Category.

PHP Tip: Always Put Constants on the Left in Boolean Comparisons

This was a standard I enforced at my last company:

Whenever you are doing a boolean check (such as an IF or WHILE), always place constants on the left side of the comparison.

The following is BAD:

// BAD
if($user == LOGGED_IN) {

The following is good:

// GOOD
if(LOGGED_IN == $user) {

Why is this such a big deal? Imagine the typo where you forget the second equals sign:

// Oops! This assigns LOGGED_IN to $user, so the check passes whenever LOGGED_IN is truthy!
if($user = LOGGED_IN) {

This sort of bug is fairly common. C# goes as far as requiring that conditions be of type bool, which rules out most accidental assignments. Since PHP can’t do that, this is the next best thing. Notice how this convention will save your butt:

// Parse error. Bug caught immediately.
if(LOGGED_IN = $user) {

Think about it. :)
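To make the difference concrete, here is a minimal sketch of both styles side by side. The LOGGED_OUT and LOGGED_IN values and the two function names are made up for illustration; they are not from any real codebase.

```php
<?php
// Hypothetical status constants, for illustration only.
const LOGGED_OUT = 0;
const LOGGED_IN = 1;

function buggyCheck($user)
{
    // Typo: "=" instead of "==". $user is overwritten with LOGGED_IN (1),
    // so this condition is truthy no matter what was passed in.
    if ($user = LOGGED_IN) {
        return true;
    }
    return false;
}

function safeCheck($user)
{
    // Constant on the left: dropping an "=" here would stop the script
    // from compiling instead of silently assigning.
    if (LOGGED_IN == $user) {
        return true;
    }
    return false;
}
```

Calling buggyCheck(LOGGED_OUT) reports a logged-in user, while safeCheck(LOGGED_OUT) does not.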

Is Your Blog Not Receiving Pingbacks? I Fixed Mine.

I recently noticed that my blog was no longer registering pingbacks (the automatic in-comment notification that occurs when somebody else blogs about your post). I like these because they help me understand which of my articles are gaining traction.

My symptoms

  • My other blogs hosted on the same server seem to be pinging fine; however, those have far fewer posts and plugins
  • I am able to send pingbacks, apparently
  • But pingbacks TO my content were dropped (even when I am self-pinging)

The fix

I figured the issue was somehow related to my recent upgrades of WordPress. After scouring the web, I found that the issue was due to a poorly designed timeout setting in WordPress.

  1. Open wp-includes/cron.php in your blog folder
  2. Go to the line that starts with: wp_remote_post( in the spawn_cron function
  3. Change 'timeout' => 0.01 to 'timeout' => 1 (or any other far more reasonable value)

This will fix blogs that are plagued by this bug.

Autocast Variables Whitepaper: What I Want to See in PHP 6

Edit: I’ve moved the autocast white paper to its own page. Let me know what you think!

How would you answer the following question:

Imagine you are now in charge of PHP. What do you cut/add/change in PHP 6.0?

I ask this during interviews, and as you can imagine, I get all sorts of answers. The best answers are pulling in features from other languages, particularly OOP concepts. These answers aren’t bad, but they almost always try to “fix” PHP’s broken OOP while also crippling the strength of a loosely typed language. I have my own unique answer to this question, and I wanted to share.

Maybe somebody else has already thought of this, but if not, I’m going to coin it right here:

Autocast variables. An autocast variable is like a container for data — everything going into an autocast variable type will always be converted to the current type of that variable. As in, if you assign a string into an integer variable, the variable will become the integer representation of the string (via implicit and immediate typecasting).

The idea is a hybrid of limited type safety – where only some variables are type safe – and operator overloading of the equals sign – on native datatypes. To help explain the idea: it would act almost like somebody following around your cursor and typing (int), (string), etc. all over your code before all variable assignments.

The goal is to allow a developer to be – when desired – 100% certain they are working with a specific data type.

Introduction to Autocasting

To declare a variable as an autocast, simply place a colon after the dollar sign in a variable name. Then, everything assigned to that variable is now automatically typecast to the datatype of the variable. For example

// This variable is now a container for integers
$:orderTotal = 0;
// assign a float value
$:orderTotal = 1.01;
// outputs 1; 1.01 was typecast to an integer
echo $:orderTotal;
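
The $: syntax above is only a proposal, but the behavior can be roughly approximated in today's PHP with a small wrapper object. The sketch below is an assumption-laden illustration: the AutocastBox class and its method names are hypothetical, and it is only meant for scalar types.

```php
<?php
// Hypothetical helper, not part of PHP: remembers the type of its first
// value and casts every later assignment to that type.
class AutocastBox
{
    private $type;
    private $value;

    public function __construct($initial)
    {
        $this->type = gettype($initial); // e.g. "integer", "double", "string"
        $this->value = $initial;
    }

    public function set($newValue)
    {
        // settype() converts the local copy in place to the remembered type.
        settype($newValue, $this->type);
        $this->value = $newValue;
    }

    public function get()
    {
        return $this->value;
    }
}

$orderTotal = new AutocastBox(0); // now a container for integers
$orderTotal->set(1.01);           // the float is cast on the way in
echo $orderTotal->get(), "\n";    // prints 1
```

The wrapper is clumsier than the proposed syntax, which is part of the argument for building it into the language.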

NOTE: Why the new syntax? I toyed with the idea of an autocast keyword, but the paradigm broke down when you started assigning objects. The problem is that objects are pass-by-reference. This meant a programmer could change the datatype of an autocast variable by altering its reference. The other problem was that without a visual marker, things would get very confusing, since one could never tell if they were working with an autocast until runtime. Lastly, why the dollar-colon? I would have preferred a straight colon, but most of the good single-character syntax would conflict with existing PHP systems (# is a comment, : is used in ternary operators, % is modulus, ^ is a bitwise operator, etc.). A dollar sign is universally understood as a variable, so I thought the next best thing was to alter the variable in a way that today’s PHP would recognize as invalid (and thus introducing the syntax would not conflict with legacy code).

The concept is simple, but gets more complicated as you introduce objects, magic methods, and method signatures into the equation. Don’t worry, I’ve thought about all of those scenarios. Key summary of benefits:

  • New coding paradigms allow for simpler interaction between different data types (see first Practical Example)
  • Refactoring can be done in a way never before possible (see second Practical Example)
  • Code is now more “reliable” because unintended data types aren’t used (such as during boolean checks)
  • Many fatal errors can now be avoided
  • Potential use in the realm of dependency injection
  • Possibilities for true function overloading since expected datatypes are known (although, this is possible today, to be honest)

Edit: read the rest here.

A PHP/MySQL Bug Most People Have But Don’t Realize

I’ve seen this over and over in my career and thought I should save others from the horror. Part of me feels like I blogged about this years ago, but I couldn’t find a post referencing it (EDIT: found it!). The bug is simple:

  1. Create a database table with a decimal value, such as order_total
  2. Write some code that retrieves the row
  3. Do an implicit boolean check on order_total to see if it has a value

Here’s some actual code:

$results = mysql_query("SELECT * FROM orders");
while($row = mysql_fetch_assoc($results)) {
    if($row['order_total']) {
        echo "Order total is clearly not zero!";
    }
    else {
        log_bad_order($row['id']);
    }
}

This code has a serious bug in it. The problem is the line pertaining to checking if the order_total has been set. Pop quiz:

What is the value of the following:
(bool) "0.00"

The answer is TRUE! 0.00 may evaluate to zero, but "0.00" is not the same thing! As soon as PHP sees more than just a single "0" in a string, it treats it as a regular non-empty string, which is truthy. A more obvious way to ask the same question:

What is the value of the following:
(bool) "0.0000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000"

Or what about:

What is the value of the following: (bool) "0."

The point is that as soon as you go beyond a single zero, PHP just assumes the rest is real data and will not discard it. Thus:

if(0.00) {
    echo "THIS NEVER EXECUTES";
}
if("0.00") {
    echo "THIS ALWAYS EVALUATES TO TRUE";
}

So, going back to the original problem, the way to solve it is to explicitly typecast the variable, use a “greater than” check, or (as here) both:

$results = mysql_query("SELECT * FROM orders");
while($row = mysql_fetch_assoc($results)) {
    if((float) $row['order_total'] > 0) {
        echo "Order total is clearly not zero!";
    }
    else {
        log_bad_order($row['id']);
    }
}
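
The same check can be pulled into a small helper so the cast is never forgotten. This is a minimal sketch; hasOrderTotal is a hypothetical name, and it assumes the value arrives as a string, which is how the mysql_* functions used above return column values.

```php
<?php
// Hypothetical helper: true only when the raw column value is a positive amount.
function hasOrderTotal($rawValue)
{
    // Cast to float first so "0.00" compares as the number zero.
    return (float) $rawValue > 0;
}

// The implicit check and the explicit check disagree on "0.00":
var_dump((bool) "0.00");         // bool(true)
var_dump(hasOrderTotal("0.00")); // bool(false)
```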

If you don’t do either of these things, I’d suggest you go and double check some of your code.

Representing hierarchical data in MySQL

I’ve always wondered if there was a better way to manage nested data structures (such as product categories) in MySQL. Today I stumbled across a solution called the Nested Set Model.

The only change I made to the solution was to rename what they call the “category_id” to “sort_id”. Then I added a primary key called “id” to the table. This way, I have an immutable ID I can use in the application (such as for URL deep linking). For example:

CREATE TABLE nested_category (
  id INT AUTO_INCREMENT PRIMARY KEY,
  sort_id INT NOT NULL,
  name VARCHAR(20) NOT NULL,
  left_sort INT NOT NULL,
  right_sort INT NOT NULL
);
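
To show how the left and right values get used, here is a rough sketch in PHP that finds every descendant of a category: a row is a descendant when its left_sort and right_sort both fall strictly between the parent's. The sample rows and the function names are invented for illustration, and the rows carry only the columns the lookup needs.

```php
<?php
// Hypothetical sample rows mirroring some nested_category columns.
function sampleCategories()
{
    return array(
        array('id' => 1, 'name' => 'Electronics', 'left_sort' => 1, 'right_sort' => 10),
        array('id' => 2, 'name' => 'Televisions', 'left_sort' => 2, 'right_sort' => 5),
        array('id' => 3, 'name' => 'LCD',         'left_sort' => 3, 'right_sort' => 4),
        array('id' => 4, 'name' => 'Audio',       'left_sort' => 6, 'right_sort' => 9),
        array('id' => 5, 'name' => 'MP3 Players', 'left_sort' => 7, 'right_sort' => 8),
    );
}

// Returns the names of all rows nested inside the row with the given id.
function descendantsOf(array $rows, $parentId)
{
    $parent = null;
    foreach ($rows as $row) {
        if ($row['id'] === $parentId) {
            $parent = $row;
            break;
        }
    }
    if ($parent === null) {
        return array(); // unknown id: nothing to report
    }

    $names = array();
    foreach ($rows as $row) {
        if ($row['left_sort'] > $parent['left_sort']
            && $row['right_sort'] < $parent['right_sort']) {
            $names[] = $row['name'];
        }
    }
    return $names;
}
```

In a real application the same range comparison would normally live in the SQL WHERE clause rather than in a PHP loop.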

Is PHP here to stay?

As a LAMP developer, I am starting to question the long term viability of PHP. PHP was born during an era when knowing HTML was a valid and valuable resume bullet. Because of this, most of the “advanced” aspects of PHP, which relate to the OOP functionality, were introduced only after PHP 4/5, and weakly at that. Additionally, new languages have since become popular that show the weaknesses of PHP. Don’t get me wrong, I am very supportive of PHP. I just believe that it’s important that people understand both the strengths and weaknesses of the tools they use. There are two main points I want to cover:

  1. PHP thread support is weak
  2. PHP OOP = Broken

The second point is rather technical, but it closely relates to another strength and weakness of PHP: it is loosely typed. More on that later.

Thread Support is Weak

True threading support in PHP does not exist. The closest thing is the pcntl_fork function, which copies the current process rather than creating a thread. This means asynchronous processing within a single process is not supported. Threading is useful in event-driven architectures (common in JavaScript) or when doing blocking operations such as network calls.

Because the forked process is a clone of the original, it shares all of the original resources, including database and file resources. This means that the forked process must be self-aware of whether it is a child or not, and must be careful not to modify or close these resources. This encourages spaghetti code that contains large logic forks (“if I am not a clone, else…”). Because of this, forking is messy and error prone. This gets further complicated when PHP is executed by Apache in a web environment. In fact, the PHP manual advises avoiding forking with web servers:

Process Control should not be enabled within a web server environment and unexpected results may happen if any Process Control functions are used within a web server environment.

Not to mention the method is incredibly C-like in that it is very “raw” (unlike other native PHP methods/classes). This increases the barrier to entry significantly, which ultimately serves to have the feature ignored by most shops.
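
To illustrate the “am I the child?” branching described above, here is a minimal command-line sketch. The runInChild helper is a hypothetical name, and the code assumes the pcntl extension is loaded; when it is not, the helper simply returns null.

```php
<?php
// Hypothetical helper: runs $task in a forked child and returns the child's
// exit status, or null when the pcntl extension is unavailable.
function runInChild(callable $task)
{
    if (!function_exists('pcntl_fork')) {
        return null;
    }

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('Could not fork');
    }

    if ($pid === 0) {
        // Child branch: do the work, then exit so the child never falls
        // through into code that was meant only for the parent.
        $task();
        exit(0);
    }

    // Parent branch: block until the child is done and report its status.
    pcntl_waitpid($pid, $status);
    return pcntl_wexitstatus($status);
}
```

Even in this tiny example, every line has to be read with “which process am I in?” in mind, which is exactly the bookkeeping that real threads would spare you.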

Why is all of this important? Well, at most companies, one language is selected for all in-house development. This is because cross training and hiring is simplified if everybody speaks the same language. There are a few common tasks that are unnecessarily difficult to do in PHP:

  • Asynchronous work: handing off work, such as connecting to a remote server, to a child and waiting for a response
  • Managing thread pools: this sort of work requires significant “by-hand” management of any processes spawned by the parent via pcntl_fork

The threading issue is only a pain point that impacts processes that need to become parallelized. It is a pain most big shops live with, or, alternatively, introduce other languages to help solve.

PHP OOP = Broken

Because of the loosely typed nature of PHP, true, well-formed object oriented programming is broken. I know that for many PHP programmers, “Object Oriented” means putting together classes and reusing code as objects. However, that is truly, sincerely, only a portion of the point of OOP. Some of the most powerful aspects of OOP are lost in PHP’s implementation of the concept. Don’t get me wrong: these decisions were probably the right fit for the niche PHP was filling, but I don’t believe most PHP programmers are fully aware of what they are missing.

While the language, thankfully, has interfaces and abstract classes, they are woefully underused. This is, in part, due to the developer community being largely self-taught. This creates a misconception about the nature of OOP, which ultimately leads to the devaluation of the most important feature of OOP: interfaces.

I can go into why they are so important in another article, but the point is: without interfaces, true polymorphic code is impossible. Or, rather, extremely susceptible to spaghetti code and fatal errors.

In other languages (Java), code might look like this:

interface Animal { void makeSound(); }
void farm(Animal cat, Animal dog, Animal parrot) {
  cat.makeSound();
  dog.makeSound();
  // Note: Parrot class *DOES* have a method called moveAround()
  parrot.moveAround(); // ERROR!
}

The interface in this example defines a uniform way to access a class through a standardized API (thus the name, application programming interface). In a strongly typed language where all variables must have a type, the cat variable is declared as an implementation of Animal. This both allows and enforces the method call makeSound(). If cat has a meow() method and dog has a woof() method, they cannot be called here without a compiler error. This is because, in this function, every argument is declared as an Animal (rather than as a Dog, Cat, or Parrot). As such, only Animal methods work here.

More importantly, because the compiler does this type checking, any invalid calls, such as the last one, would error and never compile. Even if the Parrot class has a moveAround() method, it can not be called in the code above. This is an extremely important aspect of OOP since, as a definer of the Animal class, I want to make it very specific how Animals should be treated (you can only makeSound!). If a programmer tries to do something to an Animal that I haven’t defined, they get an error. If they wanted to make that last line work, they would need to use object typecasting:

void farm(Animal cat, Animal dog, Animal parrot) {
  ...
  ((Parrot) parrot).moveAround();
}

Or by changing the function definition:

void farm(Animal cat, Animal dog, Parrot parrot) {
  ...
  parrot.moveAround();
}

But note that in this case, the user had to make an explicit choice to stop using Animal’s interface. Yes, parrot is still an Animal, but it doesn’t have to be. This, in short, helps prevent spaghetti code because it forces the developers to think about whether or not they want to deviate from a particular interface. Realistically, if presented with these alternatives, a Java programmer would probably use other types of abstraction techniques (e.g., dependency injection) to keep this method from needing to be used. However, this example was necessary to illustrate how things are done in PHP.

So how would this look in PHP? Why isn’t this the same there? Well, take a look at the following code that, unlike the Java example, works perfectly fine and raises no red flags.

interface Animal { function makeSound(); }
function farm(Animal $cat, Animal $dog, Animal $parrot) {
  $cat->makeSound();
  $dog->makeSound();
  $parrot->moveAround(); // WORKS FINE
}

This code works great. We have three arguments all forced to use the Animal interface. Great. As a casual observer, there is really, truly, nothing wrong with this code. It’s a little strange, but if it’s commonly known that Birds can moveAround(), there is no problem. In fact, in most PHP shops, I will bet money that type hinting is NOT used. This will further illustrate how bad the spaghetti is about to get (read on).

Now imagine in six months if we decide we wanted to group up this code so that it uses a single array/collection as an argument. This is where things would look like traditional polymorphic code. I mentioned spaghetti above. Let me show you why:

interface Animal { function makeSound(); }
function farm(array $animals) { // note, we can't guarantee what's inside of this array
  foreach($animals as $animal) {
    if($animal instanceof Parrot) { // or maybe a method_exists() call?
      $animal->moveAround(); // SPAGHETTI
    }
    else {
      $animal->makeSound(); // Hope for no fatal errors!
    }
  }
}

Wow, look at what we just did. A harmless piece of code in PHP six months ago completely breaks when you try to refactor it to use a fairly typical design pattern. More importantly, unless I put in even MORE code to do type checking, there’s a chance that the makeSound() line will actually die in a fatal error if, for example, a string is passed in as an element of the argument array! See example without Parrots:

interface Animal { function makeSound(); }
function farm(array $animals) { // note, we can't guarantee what's inside of this array
  foreach($animals as $animal) {
    $animal->makeSound(); // Hope for no fatal errors!?
  }
}

PHP is extremely flexible when it comes to hacking out a page, but when it comes to OOP, it’s about as brittle as you get. Refactoring is painful and error prone, and elegant design patterns like the ones you might see in a message-passing language such as Objective-C, Scala, or Erlang don’t work. Remember that by using functions such as method_exists() and is_object(), I can emulate the desired behavior; however, the extra code means more places for bugs and less time spent making the program do what you want it to do. The point is that the OOP constructs in PHP don’t fully work. As a result, certain very important aspects of OOP don’t translate very well to PHP.

Some people may still cling on to the notion that “ultimately, you can still do it, it just requires more code!” But I argue that preventing “more code” is the exact reason why OOP was invented. By writing more boiler plate error checking code, we are wasting time. The issue is exacerbated by the fact that the error checking code isn’t required, unlike say, if you were throwing exceptions. It isn’t immediately obvious in that last example that you need to do error checking for is_object() on the $animal variable. It’s these types of oversights that really damage PHP as the code base gets larger.
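
For completeness, this is roughly what the extra defensive code looks like. It is only a sketch: the Dog and Cat classes are invented for the example, and silently skipping bad elements is just one possible policy (throwing an exception would be another).

```php
<?php
interface Animal
{
    public function makeSound();
}

// Hypothetical implementations, for illustration only.
class Dog implements Animal
{
    public function makeSound() { return 'Woof'; }
}

class Cat implements Animal
{
    public function makeSound() { return 'Meow'; }
}

// Collects the sound of every Animal in the array and skips anything else.
function farm(array $animals)
{
    $sounds = array();
    foreach ($animals as $animal) {
        // Nothing in the language forces this check; without it, a stray
        // string in the array triggers a fatal error on the method call.
        if (!($animal instanceof Animal)) {
            continue;
        }
        $sounds[] = $animal->makeSound();
    }
    return $sounds;
}
```

The guard works, but it is exactly the kind of optional boilerplate that is easy to forget until production finds it for you.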

Conclusion

What I’m realizing is that PHP isn’t meant to scale. Yes, it can take a lot of web traffic, but that’s not what I mean. I’m talking about scaling in the sense of growing team size and code base. The design of the language promotes coding paradigms that ultimately damage the code base. This is because PHP makes it harder to use good OOP practices on legacy code. To illustrate:

  • PHP became popular because it is easy to hack things out, even if that something required doing it the “wrong” way. These problems come back and bite you when the code base grows.
  • PHP can’t support a large development team as effectively because its weak typing allows for sidestepping certain core OOP principles (see above)
  • PHP allows for invisible future-bugs (see above) to be inserted without any immediate cause for alarm
  • As applications get complex and require threading or distributing of processes, PHP fails to keep up (so other languages get used)
  • Because method calls are not checked until runtime, calling a method that does not exist, or calling a method on a non-object, causes runtime FATAL ERRORS (unacceptable and very hard to debug!)

All of this makes me rethink the popularity of PHP. There are some new languages, still in their infancy, that pose a threat to PHP’s current dominance. I believe that in the next few years, as today’s systems become “legacy,” today’s newcomers will finally be production ready. At that point, we might see companies adopt the newer languages, which will support more modern programming paradigms. We are seeing this today with Ruby, for example.

Of course, I could be wrong. I once told people that PHP was “C of the web.” It’s possible it’s here to stay forever, despite all of its flaws. And, for the record: I do not believe Python or Ruby will be the language that will overtake PHP, but that’s for another post.

I just want everybody to know that I am a PHP developer, so I speak from experience. We should recognize that technology changes and evolves, and it is important that we constantly update our skills to ensure they don’t become obsolete. I’m just pointing out that perhaps PHP isn’t as timeless as C (or, possibly, Java).

Lastly, I will plug my personal belief that being “religious” about a language because it is “the best” is short sighted. New languages are born, literally, every week. It’s only a matter of time before a language comes along that does what your language does more elegantly, faster, and with less code.

Only time will tell. :)

Q: Hiding JS Files? A: Impossible

In my popular post about hiding your WordPress folder, a reader asked:

Hi Michi, can you help me with this, in the head section i wrote this:

<script src="/style/js/somescripts3.js" type="text/javascript" charset="utf-8">

and when we go to the webpage then right click, it will show:

<script src="content/themes/exampletheme/exampletheme/style/js/somescripts3.js"
  type="text/javascript" charset="utf-8">

can you teach me or show me how to do that, any help highly appreciated, And im so sorry if my english not good.

This question was complicated enough that I thought a new post might make sense.

For clarification, I believe he is asking if it’s possible to put one thing in the source and another that the browser sees. This is impossible. Anything that the browser can see, the user can see. There is no way to “show” something different in the source of an HTML file versus what the browser sees (except through obfuscation); however, you can forward things along behind the scenes. You want to create an htaccess rule that will internally rewrite your requests. Note that inside an .htaccess file, the pattern is matched against the path without its leading slash:

RewriteRule ^path/to/thejsfileyouwanttoshow\.js$ /path/to/real/js/file.js [L]

Let me reiterate that you *cannot* hide the content of the JS file. However, you *can* hide the true folder structure of the web server. If you desire to hide your JS contents, the better solution is a JS minifier.

Alternatively, if your goal is to somehow make it harder for somebody to steal your code and you don’t want to use a JS minifier, you could write the JavaScript tag dynamically using another piece of JavaScript. However, ultimately, that level of weak obfuscation won’t protect you from anything since Firebug will quickly expose what’s really going on.

I hope that answers your question.

Rainbow Google and Annoying Google

I launched two more Google parodies: Rainbow Google and Annoying Google!

Rainbow Google is just a pretty demonstration of dynamic stylesheet modification. It was actually extremely hard to code — it took me about 8 hours of JavaScript hell. On another day, I’ll go over how I did it. This site adds a colorful spray of colors to the text on the page (see screenshot).

RainbowGoogle.com

Annoying Google was about 10 minutes of work since it was just a super simple version of Rainbow Google’s code. :) Search queries and results are jumbled so that their letters are in random capitalized states… LiKe ThiS.

AnnoyingGoogle.com

If you have suggestions or ideas, please let me know!

The Basics on Using Models and Controllers in PHP

Today I want to talk about passing in objects as arguments in PHP methods. Many PHP developers do not have the patience for this, which is obvious when comparing libraries written for Java with those written for PHP. It is a horridly underused programming style in PHP, and since PHP supports argument type hinting in methods, I thought it would be good to go over this particular style of development.

First of all, let’s start with a few examples of the “PHP” way of doing a mundane task (this will seem extremely familiar):

function login($username, $password, $remember);

function processMail($to, $from, $subject, $body);

function editPost($id, $title, $body, $newTime);

Of course, good developers would make sure these types of examples are part of a class:

$user = new User;
$user->username = $_POST["username"];
$user->password = $_POST['password'];
$user->login();

The examples can continue, and I hope that you good developers use this type of basic OOP development when using PHP. :) But there are examples where the standard “newbie” OOP model seemingly falls apart. The system quickly breaks down when new requirements are added to the system. For example, what if we want to do a “remember me” option in the login? What about logging in as an administrator versus a regular user? Okay, now what if we have different login session lengths depending on the user type? How long do you think that login function will be? Think about how it will accommodate IP bans, login logging, banned users, suspended users, max failed attempt lockouts, etc. The list goes on, and depending on your implementation, things can get very ugly. Your login function might become a huge bloated monster sitting in your User class.

The problem is that what you’re actually doing is mixing a data model [user data] with controller logic [how to login using the user data]. The solution is to separate these two entities into two classes, which is what you would see in most modern MVC frameworks.

Here is the seemingly more complex, but far more elegant solution:

$authenticationController = new UserAuthenticationController;
$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->rememberMe = true;
$authenticationController->login($user);

The prototype for the login method would look like this:

function login(User $userObject);

In this example, I am hoping to show you possibilities. First, notice that the arguments for login() are down to one. But the more interesting part of the implementation comes with the proper abstraction between the User data and the authentication process. In my example, I just logged in a regular user. So if I had to map out my class structure, it might look like this:

(Abstract class) AuthenticationController
=> GuestAuthenticationController
  => UserAuthenticationController
    => AdminAuthenticationController

The old way of doing things would look something like these:

function adminLogin($username, $password, $remember);

$user->adminLogin();

But using the Java-esque model, we’d end up with something like this:

$authenticationController = new AdminAuthenticationController;
$user = new User;
$user->username = $_POST['username'];
$user->password = $_POST['password'];
$user->rememberMe = true;
$authenticationController->login($user);

This means the login method is likely broken up in a few pieces inside the AuthenticationController. The Guest user’s login() method would always return false. The UserAuthenticationController would piggy back on AuthenticationController::login() by looking at the User::rememberMe variable and take it into account. But the AdminAuthenticationController doesn’t allow people’s logins to be remembered due to security reasons, so it doesn’t take that variable into account. And in that crazy case where there is nothing different about the admin login, the method would remain untouched (inherited from the parent), but any other changes (such as session length) would still kick in for the admin user with no additional coding.

All of this is done without modifying the core User class. The user class remains clean for its own further abstraction possibilities. New fields such as profile, name, DOB, etc., could be added with no modifications to the controller.
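
A rough sketch of how those controllers could be laid out follows. It rests on several assumptions: the credential check is a placeholder rather than a real lookup, the returned array stands in for whatever session handling you actually use, the hook method names are invented, and for brevity the guest and user controllers both extend the abstract base directly instead of nesting as in the hierarchy listed earlier.

```php
<?php
class User
{
    public $username = '';
    public $password = '';
    public $rememberMe = false;
}

abstract class AuthenticationController
{
    // Shared login flow; subclasses adjust it through the hooks below.
    public function login(User $user)
    {
        if (!$this->credentialsAreValid($user)) {
            return false;
        }
        return array(
            'username'   => $user->username,
            'remembered' => $this->shouldRemember($user),
        );
    }

    // Placeholder check (assumption): real code would consult a user store.
    protected function credentialsAreValid(User $user)
    {
        return $user->username !== '' && $user->password !== '';
    }

    protected function shouldRemember(User $user)
    {
        return (bool) $user->rememberMe;
    }
}

class GuestAuthenticationController extends AuthenticationController
{
    // Guests can never log in.
    public function login(User $user)
    {
        return false;
    }
}

class UserAuthenticationController extends AuthenticationController
{
}

class AdminAuthenticationController extends UserAuthenticationController
{
    // Admin logins are never remembered, whatever the User object says.
    protected function shouldRemember(User $user)
    {
        return false;
    }
}
```

Notice that the User class contains nothing but data, and that the admin rule is a three-line override rather than another branch inside a giant login function.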

Yes, my version requires the most lines of code, but it is also the easiest to maintain and understand. Why? Because it isn’t cluttering up the User data class with methods that have nothing to do with the user. If you’ve ever written a generic “user” class, you know how large and cluttered such a class can become when you start piling in the methods for login, logout, preferences, session management, lookups, and other needs. I haven’t even talked about the fact that virtually all “operations” that involve a user also involve the database, which adds its own headaches. Being able to keep the hard work in other more data-manipulation-oriented classes is for your own good.

If down the road, it is determined that logging in should also require pre-approved IP addresses, what will your code need? Will your login method need an IP address passed in too? Or will the IP address be generated inside the login method? In my example, I would update the core Authentication class and be done. What happens when a new requirement is added that requires that the login also passes a CAPTCHA test? What about when logins need to be logged in a flat file? What happens when we change logins to use the email field instead of the user field?

Today, I only talked about logins. The ideas I propose here are not new; they are simply good ideas that get ignored by web developers. Remember, you’re application developers that happen to work in a browser. Don’t think that regular application design principles don’t apply to you: they do, more than ever.

Getting Around Overwriting form.submit()

Since my dear reader Sameer requested it, I’m here making an update. I’ve got a cool JavaScript fix for everybody! I mentioned this in a post a long time ago: JavaScript has this semi-unexpected “feature” where you can accidentally overwrite the submit() function from a form. As in:

<form id="myform">
<input name="submit" value="submit me" type="submit" />
</form>
<script>
document.getElementById('myform').submit(); // THIS FAILS – Object not a method
</script>

Apparently, by creating a form element called “submit” you overwrite the native function that exists in every form element in JavaScript. Because it’s native, it also means you can’t just willy-nilly redefine it. And to make things worse, you cannot (at least not in a cross browser manner) successfully re-assign the submit() method, because some browsers will disregard any attempt to reassign its value. As in:

<script>
document.getElementById('myform').submit = 'This gets ignored';
</script>

Fortunately, there is a fix. This fix requires modifying the actual DOM. Because this tends to be inconsistent across browsers, I’m doing this fix in MooTools (which is my JS library of choice). However, the fix is fairly straightforward and can easily be done with (or without) any JS framework, as you will see. The steps are:

  1. REMOVE the form element in question. This is an absolute requirement to make the solution cross browser compatible. This can be skipped, but it will cause quirks. However, the good news is that we can assume that 99.99% of all form elements named “submit” are due to designers being ignorant — thus, such cases are exclusive to submit buttons. Luckily, these are almost NEVER needed in the server side code and really just act as wall flowers.
  2. Check if step #1 completed successfully
  3. If it did not, create a new Form element and copy its submit function over
  4. Submit

The code looks like this:

<script>
var formObject = document.getElementById('myform');
// Removes the node
formObject.submit.remove();
// Functions don't have tagName defined
if('undefined' == (typeof formObject.submit.tagName)) {
    // create a form and assign its submit function
    formObject.submit = new Element('form').submit;
}
formObject.submit();
</script>

Let me know if you encounter any problems.