PHP: md5('240610708') == md5('QNKCDZO') (3v4l.org)
222 points by dbrgn 19 hours ago | 158 comments





I'm not exactly clear on how PHP == works, but you can see the MD5 for yourself:

    $ echo -n 240610708 | md5sum
    0e462097431906509019562988736854  -
    $ echo -n QNKCDZO | md5sum
    0e830400451993494058024219903391  -
    $ echo -n aabg7XSs | md5sum
    0e087386482136013740957780965295  -
All of them start with 0e, which makes me think that they're being parsed as floats and getting converted to 0.0. This is why "magic" operators like == in PHP and JavaScript never should have existed in the first place. Operators like == should be, by default, extremely boring. PHP's just happens to be a bit more magical even than JavaScript's.
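The "0e" observation can be checked without PHP. Here is a minimal Python sketch that recomputes the three digests and tests whether each one is "0e" followed only by decimal digits, which is the shape that reads as scientific notation for zero. The helper name looks_like_zero_exponent is made up for illustration, and this only shows what the strings look like as numbers, not how PHP itself compares them.

```python
import hashlib
import re

# Hypothetical helper pattern: "0e" followed only by decimal digits,
# i.e. a string that reads as scientific notation for zero.
ZERO_EXPONENT = re.compile(r"0e[0-9]+")

def looks_like_zero_exponent(digest: str) -> bool:
    return ZERO_EXPONENT.fullmatch(digest) is not None

inputs = ["240610708", "QNKCDZO", "aabg7XSs"]
digests = {s: hashlib.md5(s.encode("ascii")).hexdigest() for s in inputs}

for s, d in digests.items():
    # float() reads "0e<digits>" as zero times a power of ten, so the value is 0.0
    print(s, d, looks_like_zero_exponent(d), float(d))
```

Any language that parses these strings as floating-point numbers would see three zeros, which is consistent with the guess above.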

reply


Once I wrote a little PHP application to manage a clan in a browser game. I used an MD5 hash as session id that I checked with if(session_id)

When users started reporting that their logins would sometimes not work the first time, I found out that strings that start with zero are coerced to 0 and then interpreted as false.

Never used PHP for anything important since.

reply


To be fair, this kind of thing (maybe not exactly this, but type-coercion bugs) can happen in JavaScript, which is all the rage now for "important" stuff.

reply


It can happen in a few languages, but PHP is notably more aggressive in trying to convert to int.

Actually a common way to grief new websites is to try to register '0' as a username. `if (string)` is a common way to check for null, and '0' will often fail.

reply


This is levels worse than what JavaScript does, though. Most high-level languages have some sort of implicit coercion (even Python lets you do truth tests on non-boolean values). The problem here is that the programmer isn't confused about types at all. They're comparing two things of the same type: two strings! Nevertheless, given two strings, PHP tries to coerce them into numbers before carrying out the equality test. Yes, you will have coercion bugs in other languages if you're testing things of different types, but I don't know any other language where an equality test between two things of the same type automatically coerces them into another type.

reply


Yeah, but JavaScript has 'use strict', whereas PHP decided that the easter egg "looks like you're using the wrong language!" was more important than actually allowing a 'use strict' to force === instead of ==.

reply


While that's true, JavaScript is still horribly error-prone because of this. The suggestion that JS would be a much better language if the == operator worked more like === in the first place is very reasonable.

reply


What did you use for the important stuff that was 100% predictable?

reply


zeroes and ones.

reply


This does not appear to be the case in PHP 5.6, even for most strings with '==' gotchas:

  <?php
  if ('0e24') echo 'true'; else echo 'false';

  outputs:
  true
As far as I know the only strings that fail an if check are "" and "0". (Which is still a pitfall, but not one you'd hit with an MD5 hash)
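To spell that rule out, here is a tiny Python model of the claim above (only "" and "0" fail an if check). The name php_string_truthy is invented for this sketch, and it mirrors only the rule as stated in this comment for strings, not PHP's truthiness rules for other types.

```python
def php_string_truthy(s: str) -> bool:
    """Model of the rule stated above: a string is falsy only if it is "" or "0"."""
    return s not in ("", "0")

samples = ["", "0", "0e24", "00", "0.0", "0e462097431906509019562988736854"]
for s in samples:
    print(repr(s), php_string_truthy(s))
```

Since an MD5 hex digest is always 32 characters long, it can never be one of the two falsy strings under this rule.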

reply


Yeah, documentation is for pussies.

reply


Here is one PHP core developer claiming that PHP documentation is wrong, even on fundamental things...

http://www.reddit.com/r/lolphp/comments/2md8c0/new_safe_cast...

Just saying....

reply


But it is not wrong in this case.

reply


> I used an MD5 hash as session id

> Never used PHP for anything important since.

The problem here isn't PHP, the problem here is you.

reply


You bought a new car. You took it out for a ride. A tree falls in front of you. You brake, but the car proceeds to hit the tree anyway.

You call the car company and talk to their engineers. One of them asks, 'Did this happen on a Friday evening, when it was raining?' You say, 'Yes, how do you know?'

The engineer replies:

"Our brakes do not work on rainy Friday evenings. If you REALLY want to brake on a rainy Friday evening, you should also pull the lever under the dashboard that is normally used to open the hood. It is very clearly printed in our manual. Didn't you read it? Our car is not the problem. You are the problem."

You were enlightened. You came back home. You never took the car out on rainy Friday evenings. When somebody asked about the car, you said, "Yeah, it is a great car. But you've got to know how to use it."

You took great pride in knowing how to drive this car, which can easily kill someone who hasn't read the manual. When you heard that someone got killed while driving this car, you simply said, "That car is OK, but you should really know how to drive it. Sadly, this guy didn't. He was the problem, the car ain't..."

reply


From now on, when I write “RTFM,” I will also link to this comment.

reply


> You bought a new car.

There's your problem, wasting money on something that only depreciates in value. Tsk tsk.

reply


sure, everything should be done perfectly or not at all ...

reply


We can accept that perfection may be impossible, difficult to obtain, or a poor tradeoff against other factors.

But that doesn’t mean that all imperfect designs are of equal merit.

reply


Nah, the problem is PHP.

See: http://blog.codinghorror.com/falling-into-the-pit-of-success...

> When you write code in [PHP], you're always circling the pit of despair, just one misstep away from plunging to your doom.

reply


I'd be willing to say this is true for any language in varying ways.

reply


But to widely varying degrees. This kind of problem is a direct consequence of having a relatively weak and dynamic type system (or other semantics that mean you might as well have).

Plenty of people have warned about this kind of danger for a very long time. However, there seems to be a significant subset of the web development community that only has experience with languages like JS and PHP and to a lesser extent other dynamic languages like Ruby and Python, who simply fail to realise how many of these bugs should have been entirely prevented by using better tools by now. The usual counter seems to be something about unit tests, at which point anyone following the discussion who actually knows anything about type systems and the wider world of programming languages dies a little inside.

It is entirely fair to criticise bad tools for being bad, particularly in specific ways and with clearly identified problems that can result as in this case. It's bad enough that we are stuck with JS for front-end web development these days, but there aren't many good arguments for using something as bad as PHP on the back-end in 2015.

reply


And what is gained from doing so? We must remain critical.

reply


No, the problem is using a shitty function like MD5 for any practical purpose.

reply


I didn't hear him blaming PHP. Defensive, are we?

reply


He didn't blame PHP, just never used it again for anything important. Did we read the same comment?

reply


This, combined with the fact that you can increment strings, gives some 'interesting' results:

    $a = "2d9"; 
    $a++; 
    echo $a . "\n"; 
    $a++; 
    echo $a . "\n"; 
Output

    2e0
    3

reply


This is wonderful, I love it!

Interestingly [1], this echoes "2e0" followed by "3" in hhvm-3.7.0, but "3" followed by "4" in hhvm-3.6.0.

[1] http://3v4l.org/sJhP8

reply


That means that someone was using this "feature" in a relatively core piece of code from the PHP ecosystem. Enough that hhvm felt they needed to support it.

reply


There is some nasty type conversion going on here, from the type of stochastic random throws of two nine-sided dice to floats to integers. Where is your type preservation, PHP?

reply


Is there any way to defend against this one? I know to use === to turn off type conversion with the equality operator, but what about here?

reply


Use an if condition to check the type, and don't try to increment a string...

if (!is_string($notstr)) ++$notstr;

edit:

I think checking for an integer is better, if you're just incrementing an integer.

I was going to say type hinting, but I just realized PHP's primitives cannot be type hinted.

reply


Who tries to increment strings anyway? What is your point here?

reply


Given how happy PHP is about converting strings to integers on demand, it would be pretty easy to take string input intended to be a number, forget to actually convert it, and go around using it happily until one day you accidentally set off a bomb.

reply


Yeah, who would do that? Nobody. So why the _ is it possible in the first place?

reply


because PHP is dynamically typed, it's easier to accidentally increment a string.

reply


I tried replacing == with === and it gave me bool(false) in all cases.

Here's the code: http://3v4l.org/15hr7

reply


Ahh PHP, the language where true == false

    php > if ((true == "foo") && ("foo" == 0) && (0 == false)) echo "yay!";
    yay!

reply


This truly just bummed me out :(

reply


Don't let it - he understands exactly how the types are being converted in order to make it appear that true == false.

This sort of thing happens in type conversion languages. You can either use === to stop conversion or you can understand how conversion works.

I'm not sure how the order of conversions is decided by PHP, but here's a brief explanation:

Compare "foo" to true. Convert the string "foo" to a boolean value. As it is desirable that a non-empty string evaluate to true, we will say they are "equal."

Compare "foo" to 0. Convert the string "foo" to a numeric value. As "foo" does not start with 0x it cannot be hex, and as it does not start with 0 it cannot be octal, so evaluate it as decimal - there are no numbers before the first letter so the string foo, when numerical, is 0.

Evaluate 0 to false. Well, that's just binary now isn't it? Of course false and 0 are equal!

The moral of the story, == is not "exactly equal" it is "relatively equal."

reply


The problem with designing a language that does these sorts of implicit type conversions is that the "equality" operator violates the fundamental properties of equality. Since grade school mathematics we are all taught that equality is symmetric and transitive, and PHP's == operator is neither.

reply


"This sort of thing happens in type conversion languages. You can either use === to stop conversion or you can understand how conversion works."

Even JavaScript isn't insane enough to somehow coerce a string to 0.

reply


Yes, JavaScript will convert strings into numbers

    console.log(5*"12");
    60
    console.log(5*"0x0C");
    60

reply


Actually, type conversion sometimes makes code a little bit handier.

We use Java at the backend and of course JavaScript for the frontend. When serializing, in Java we have to write:

        String dataRaw = "42";
        int objectId = Integer.parseInt(dataRaw);
Meanwhile, in JS, it is fairly simple:

        dataRaw = "42";
        var objectid = +dataRaw;

reply


12 is not 0.

reply


Yes and no. JavaScript gives you the good ol' "NaN" which is a number (despite not being a number).

PHP doesn't have that concept in it.

reply


You can either use === to stop conversion or you can understand how conversion works.

It has been my experience that, to a first approximation, no-one fully understands how conversion works in such languages to the point of never getting it wrong in practice.

Of course we didn't know that would be the case when some of these languages were first created, but I think it is a compelling argument for making an actual-equality == operator the default in any new programming language design. There are enough plausible differences because of things like reference vs. value semantics already, without breaking basic intuitions about what comparisons mean as well.

reply


I've never seen one, but somewhere there must surely be a PHP version of the infamous 'WAT' talk about JavaScript, full of examples like this and the "2d9"->"2e0"->3 example mentioned by lars.

reply


There is a blogpost: http://eev.ee/blog/2012/04/09/php-a-fractal-of-bad-design/

reply


Same goes for the `0E` prefix with an uppercase E

The likelihood of generating a hash value with that kind of prefix is 2 in 65536.

Finding a collision `hash(a) == hash(b)` with this "weak" equality comparison is approximately 1 in 256 if I'm not mistaken.

reply


You are mistaken: my guess is that you are taking the square root because of the birthday paradox, but that is incorrect, and the birthday paradox does not apply here anyway.

The probability of generating a hash with the right prefix is 10 in 16^3, or about 0.25%. Finding a 0e... == 0e... collision has probability ~6e-6, if both inputs are random. The chance that two hashes collide in this way given N random inputs is 1-(1-p)^(N-1), for N>0.

reply


> The likelihood of generating a hash value with that kind of prefix is 2 in 65536.

The prefix is not sufficient though, the suffix must be entirely decimal otherwise it's not a valid number in scientific notation.

reply


The prefix is sufficient. Any hash matching /0e[0-9].*/ works.

reply


As far as I can see the prefix is not sufficient, a single non-digit character in the tail fails the conversion (and the equality check): http://3v4l.org/ctASF (vs http://3v4l.org/5FvJu, exact same strings but for the last character replaced by a digit)
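The behaviour seen in those two links can be modelled roughly as follows. This is a simplified Python sketch, not PHP's actual implementation: it assumes that == on two strings compares numerically only when both whole strings are numeric, and otherwise compares them as plain strings. The names NUMERIC, is_numeric_string and loose_string_equal are invented here, and the model compares everything as floats and ignores whitespace, hex strings and overflow handling.

```python
import re

# Assumed shape of a fully numeric string: optional sign, digits with an
# optional decimal point, optional exponent. Nothing else may follow.
NUMERIC = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?")

def is_numeric_string(s: str) -> bool:
    return NUMERIC.fullmatch(s) is not None

def loose_string_equal(a: str, b: str) -> bool:
    """Model of == on two strings: numeric compare only if both are numeric."""
    if is_numeric_string(a) and is_numeric_string(b):
        return float(a) == float(b)
    return a == b

# Both strings are numeric and both parse to zero.
print(loose_string_equal("0e1", "0e2"))
# A single non-digit in the tail makes the strings non-numeric: plain string compare.
print(loose_string_equal("0e12345a", "0e67890b"))
```

Under these assumptions the prefix alone is not enough: the whole string has to parse as a number before the numeric comparison kicks in.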

reply


Which means the probability of generating a hash value of the form 0[eE][0-9]{30} is (1/128)(10/16)^30 or 5.9e-9.

It certainly reduces the strength of the hash (and MD5 shouldn't be used anymore in any case), but still a roughly 6 in a billion chance of someone choosing e.g. a password and it happening to be exploitable in this manner.
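The arithmetic is easy to check in a few lines of Python. This sketch assumes every hex character of the digest is independent and uniformly distributed. It reproduces the figure above with the 1/128 prefix factor, and also computes the result with a 1/256 factor, which would apply if the digest is printed in one letter case only (as in the md5sum output earlier in the thread), since "0" and "e" are then each 1 of 16 possible values.

```python
# Probability that 30 hex characters are all decimal digits (10 of 16 values each).
tail = (10 / 16) ** 30

# Prefix factor as stated in the comment above (counts "e" and "E" separately).
p_as_stated = (1 / 128) * tail

# Assumption: single-case digest, so the prefix "0e" has probability (1/16) * (1/16).
p_single_case = (1 / 256) * tail

print(f"as stated:   {p_as_stated:.2e}")
print(f"single case: {p_single_case:.2e}")
```

Either way the order of magnitude stays at a few in a billion per hash.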

reply


From the manual:

  > The value is given by the initial portion of the string.
  > If the string starts with valid numeric data, this will
  > be the value used. Otherwise, the value will be 0 (zero).
  > Valid numeric data is an optional sign, followed by one
  > or more digits (optionally containing a decimal point),
  > followed by an optional exponent. The exponent is an 'e'
  > or 'E' followed by one or more digits.
Also:

  > If you compare a number with a string or the comparison
  > involves numerical strings, then each string is converted
  > to a number and the comparison performed numerically.
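The first quoted rule can be sketched as a small parser. This is a Python model of the manual text above, not PHP's real conversion code. LEADING_NUMBER and leading_numeric_value are names invented for this sketch, and the pattern follows the quoted wording literally (optional sign, digits with an optional decimal point, optional exponent).

```python
import re

# Leading portion per the quoted text: optional sign, one or more digits
# (optionally containing a decimal point), then an optional exponent.
LEADING_NUMBER = re.compile(r"[+-]?[0-9]+(\.[0-9]*)?([eE][0-9]+)?")

def leading_numeric_value(s: str) -> float:
    """Return the value of the leading numeric portion of s, or 0 if there is none."""
    m = LEADING_NUMBER.match(s)
    return float(m.group(0)) if m else 0.0

samples = ["10th_grade", "12 monkeys", "foo", "2e0", "0e462097431906509019562988736854"]
for s in samples:
    print(repr(s), leading_numeric_value(s))
```

Under this rule "foo" has the value 0, which is what the "foo" == 0 examples elsewhere in the thread rely on.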

reply


You're right:

    % php -r 'var_dump("0e1" == "0e2");'
    bool(true)

reply


I don't think I'd use the word "magic" here, because it implies that == works when, by any reasonable standard, it does not.

reply


Type coercion is fine so long as you recognize it as the syntactic sugar that it is. JS and PHP support easy type coercion because HTTP is string-only and it would be a pain in the ass to explicitly cast every value you get over the wire. You just have to be sure that, when you use it, you do so intentionally and not out of laziness.

reply


Here's what I took away:

It's a pain in the ass to validate and sanitize your input.

reply


So, you should use the old shell trick of adding an "X" to the front of the strings before comparing?

reply


Or use === instead of ==.

The PHP developers have been pretty honest about the mistakes they made early on because they didn't know better. Unfortunately, many of those mistakes persist. The difference between == and === is one of the more well-known mistakes.

reply


This is a well-known PHP trick. Use === to get the right result.

  php > var_dump(md5('240610708') == md5('QNKCDZO'));
  bool(true)
  php > var_dump(md5('240610708'), md5('QNKCDZO'));
  string(32) "0e462097431906509019562988736854"
  string(32) "0e830400451993494058024219903391"
  php > var_dump(md5('240610708') === md5('QNKCDZO'));
  bool(false)
  php > var_dump("0e462097431906509019562988736854" == "0e830400451993494058024219903391");
  bool(true)
  php > var_dump("0e462097431906509019562988736854" === "0e830400451993494058024219903391");
  bool(false)

reply


> This is a well-known PHP trick. Use === to get the right result.

Everybody knows PHP is a trickly-typed language. Read the docs people or PHP will take advantage of your gullible ass.

reply


Perhaps the ==== operator should be reserved.

reply


php_real_equivalence_4()

reply


Absolutely! However, we must be careful not to define it in a too predictable way lest we violate the Principle of Most Surprise.

reply


PHP's type coercion is nothing like I have ever seen in any other language. It's horrendously messy, ugly and completely inexcusable. Strings type-cast to integers are 0. Seriously? Take a look at this:

> $arr = array(0, "was", "invented", "in", "india");

> var_dump( in_array("Hello", $arr ) );

and yeah, it is TRUE because "Hello" got coerced to 0. I blogged about a major bug I faced in PHP, where the column name "10th_grade" was being type-cast to "10", failing the "bindParam" [1]. Even if they have to continue this "feature" because of backwards compatibility, the least they could have done was NOT to use it in the newer functions, but no, even they have this stupid "type juggling".

[1]: http://coffeecoder.net/blog/my-perfect-reason-avoid-php-type...

reply


There are a couple of things we have learnt in our collective 50+ years of software engineering:

1. Code is not English: Nice try COBOL, and someone had to try, but a failed experiment. Bizarre holdouts: SQL

2. People are not idiots, and will not collapse into a gibbering heap if their programming language insists that 0 and "0" are different things and must be managed accordingly. Bizarre holdouts: PHP, JavaScript. Honourable mention: Excel (no Excel, that is not a f&@cking date, I will tell you if I want a date).

reply


> 2. People are not idiots, and will not collapse into a gibbering heap if their programming language insists that 0 and "0" are different things and must be managed accordingly.

This. People are not idiots, they're learning. By making your language assume the programmer is an idiot, you're making it more difficult for said programmer to form a coherent mental model of what's going on.

reply


Bizarre holdouts: SQL

I think SQL is actually one of the better implementations of this idea. It's a bit verbose, but I don't think it's tripped up people in the same way that PHP and JS do.

reply


SQL is great for a very specific job: talking to a database. If you try to do anything else in it, you end up in a horrible mess (e.g. cursors).

Luckily, people rarely try to do anything difficult in SQL, because they are using another language and dropping into SQL to talk to their database. This can lead to inefficient code, depending on the API/SQL engine, but it means people end up with sane code (unless their other language is PHP, of course.)

reply


Absolutely. To be fair to JS, Eich admitted it was a horrible mistake, and tools like JSLint enforce the use of ===.

I haven't seen any mea culpa from the PHP team yet. I would like to read about it.

reply


Yes. To be strictly fair, both JS and PHP have legitimate excuses; JS because it was done in an insanely short timescale, PHP because it was (initially at least) cobbled together by an amateur for his own purposes. I doubt anyone could have predicted that both languages between them would basically be running the planet by 2015 :)

reply


> in_array(.., .., $strict)

I think you're aware of the third parameter but for anyone who reads this post, it disables the type coercion of the in_array call.

reply


Spoiling a good rant with facts.

reply


The fact that you need to specify a third, optional parameter to get sane output out of a really basic function is still pretty rant-worthy.

reply


But beware, with strict comparison: 1 ≠ 1.0 (because int ≠ float).

reply


But... wouldn't you expect that?

I mean, they AREN'T the same value really.

reply


Agree, they aren't the same type to begin with.

reply


    PHP's type coercion is nothing like I have
    ever seen in any other language. It's
    horrendously messy, ugly and completely
    inexcusable.
Is it objectively worse than type coercion in JavaScript?

reply


Oh yes. In Javascript, the operands are only coerced if one of the operands is a number. So when comparing two strings (regardless of whether the strings can be interpreted as a number), you always get a regular string compare.

  "12" == "12.0" -> false, basic string compare
Furthermore, if one operand happens to be a number, and the other operand has illegal characters to be interpreted as a number, the two operands aren't equal.

  0 == "foo" -> false, "foo" is not a valid number
  12 == "12 monkeys" -> false, "12 monkeys" is not a valid number
  12 == "12.0" -> true, "12.0" is a valid number, and compares equal to 12.
In short, in JavaScript the == operator actually makes sense. In PHP, every single one of the above examples would evaluate to true.

reply


While it is obvious that PHP's == operator is horrible, JavaScript has its share of pretty bad issues, like "x" - 1 giving NaN.

What I don't understand is why some people agree that PHP is a horrible language, while at the same time praising JavaScript as messiah of scripting. These two languages don't just have problems, they have very similar problems. Moreover, they gained popularity for very similar reasons (lack of choice).

Seriously, if you posted something similar to OP about JavaScript the first thing people would tell you is "What, you're still not using ===?!"

reply


PHP: var_export(0 == "hello"); // true

JavaScript: console.log(0 == "hello"); // false

reply


Actually, I was hoping for something more than a single example.

Or, did you mean that PHP and JavaScript were neck-and-neck all the way up to that one example, and ultimately it's the very one that proves PHP's type coercion is worse?

reply


You're in a thread about how PHP's type coercion can easily cause a serious vulnerability. So, the title of this thread is your second example. If you want a third example, find it yourself.

reply


Give me a break. 3 examples isn't enough to answer the question. Your comment history shows you ask questions in lieu of doing your own research. If you don't want to take the time, then move on.

reply


Amusingly enough, the current Puppet parser does this too because it has some weird form of type juggling. This has been fixed in the future parser, which actually has a type system too :).

reply


PHP's == has a lot of oddball effects. They were put in so that things would behave the way a novice expects them to (3 == '3') but would confuse more experienced programmers, or those coming from other languages.

Unless you're deliberately taking advantage of automatic type conversion and whatnot, you should probably use === by default.

reply


> They were put in so that things would behave the way a novice expects them to (3 == '3')

It's a very wrong approach. It may look newbie-friendly, but in fact it makes the language much harder to learn and use. Any novice will be constantly attempting to form a mental model of what's going on and how the language interprets concepts. Refusing to do things like 3 == '3' is simple and makes sense. Assuming a programmer is an idiot and trying to outguess his mistakes makes the language so complicated that the novice will not be able to form a coherent model and will most likely assume that "this thing is magic".

reply


It's hard for newbies who want to master the language. It's not hard for people who have no interest in learning a programming language and just want to make the thingy in their HTML do some stuff.

Register globals,

    <?php
        if ($category == 2) {
            echo 'Foo';
        }
    ?>
and be done.

We have to remember the PHP origins and audience from way back to understand why this was considered easy to use.

reply


That's actually interesting. It's not obvious to me that "2" should be parsed as an int and not a string. Perhaps we should either be explicit about what we want "2" to be parsed as (int, long, float, double, bigint, bigfloat, string...) or let the parsing of a number be determined in a more dynamic way. If you're comparing a string with an integer literal, then you probably want the string interpretation of the literal, right?

Not that this is particularly important, I guess.

reply


We are pretty sure what the literals mean. On the other hand we have many string channels: get/post/cookie/persistent storage¹/… Given that environment, it's probably natural that you try to convert a string into its "intended" type.

¹no DB, but the "just write your visitor counter into a plain text file" back then

reply


>We are pretty sure what the literals mean.

News to me. You have to enter a really high-precision number as a string in Java so it won't be rounded off to fit within a double. This is an unsolved problem.

reply


shhhhh, people don't realize PHP started out as just a tool for Rasmus and ended up evolving. No, to them, PHP was DESIGNED this way on purpose from the ground up.

reply


Do you consider that an acceptable excuse for its behaviors fifteen years after the fact? Because I do not.

reply


Of course not, but everyone keeps comparing PHP to languages that were designed and developed to be languages, not a toolset that some crappy developer (his own words) created for his personal site that ended up evolving and becoming a real language.

It's got quirks, we get it. Let's keep improving the language as we go instead of constantly bashing it. I mean PHP is one of the most widely used languages on the web today.. Clearly it's doing something right.

reply


People compare PHP to other languages regularly used in 2015 for web development. In that light, it compares very poorly.

McDonald's is super popular, too, and deserves even more of a ration of shit than they get for feeding people slop.

reply


Designers of future languages, please take this example as a proof of the rule: don't design anything for newbies. They will find a way to make an error anyway, but dumbs-based design will be the problem for everyone else.

reply


That - and the error is going to be a lot more subtle and harder to find.

In all fairness though, it's a balancing act - There are benefits to dynamic typing, but PHP clearly overdid it. (See also the disaster that was/is magic quotes)

reply


I believe you are confusing dynamic with weak typing. Other languages got dynamic typing quite right.

reply


And in C we can do this to get TRUE:

    return (33 == '3');
:P

reply


Incorrect. However, (0x33 == '3') will return true, as will (51 == '3'). Your point is valid, even if your code is wrong. Automatic type coercion can produce unexpected results in any language.

PHP's automatic type coercion rules are designed to help newbies at the expense of experienced developers. C's automatic type coercion rules are, largely, designed to expose the underlying memory layout to developers who know what they're doing, at the expense of inexperienced developers. Both can easily contain dangerous pitfalls, but I prefer the latter philosophy over the former.

(Disclaimer: I have built a career as a C programmer and frequently use its lower-level features to great advantage. I am biased.)
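The numbers in the correction are easy to confirm. A short Python check, assuming an ASCII-compatible character set (which is what the 0x33 value above implies):

```python
# The character '3' has code 51 (0x33) in ASCII; it is neither the integer 33 nor 3.
code = ord("3")
print(code, hex(code))
print(code == 0x33, code == 51, code == 33)
```

So 51 and 0x33 are the same value written two ways, and 33 in decimal is a different character altogether.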

reply


And there are no type-coercion rules at play in case of 51 == '3', because type of '3' is int (as per ISO 9899 p. 6.4.4.3.2).

reply


Excellent point. Thank you for the clarification. This is true in c99 as well.

reply


Okay, I stand completely corrected.

reply


> you should probably use === by default.

Unfortunately, this can also backfire if your class/module is used in a different context where it gets strings instead of integers and you were just using === without really thinking about it.

We had a case where the code was something like:

  function doSomething($value) {
    if ($value === 0) {
      //do something
    } else {
      //do something else
    }
  }
This was then used in a slightly different context where $value was the string '0'; it then ended up incorrectly in the //do something else block, doing completely the wrong thing. In this case the type-coerced == would have been better. I think what the developer was expecting was a type error due to the ===, but it's not a type error; it just falls into the else block.

reply


You're describing the expected behavior of === and a bug.

This is not the === operator "backfiring."

reply


Absolutely. I was just suggesting that the "you should always use the === operator" advice, which I see a lot of people give (multiple examples in this thread), does not guarantee you won't run into problems with incorrect types, and giving an explanation.

As always, you should be thinking when programming.

reply


As a rough generalization, all PHP code that involves "==" and "!=" should be considered broken.

PHP introduced "===" and "!==" a long time ago, and every programmer should know that they have to use that, without any excuses.

Also, don't use "in_array($a, $b)", but use "in_array($a, $b, true)" instead.

reply


>without any excuses...

Oh yea? How about this,

http://www.reddit.com/r/PHP/comments/2zhg6z/how_true_is_this...

reply


I don't see how "==" would help in that situation, other than "solving" this particular issue by opening another can of worms.

You simply can't use PHP arrays for user-generated keys in a safe manner. At the very least you have to add some prefix like '_stuff_' to all keys to avoid accidental conversions. And yes, this "proper" solution (Can you ever say "proper" in PHP? Anyway ...) doesn't have to involve "==", but works perfectly (and preferably) with "===".

reply


So what you're basically saying is that the "standard" variations and APIs which people will find and use are broken, and the ones actually working are hidden somewhere in the documentation. And you're saying you think this is just fine?

In that case, I have a hammer to sell you, and I think you know which one.

http://blog.codinghorror.com/the-php-singularity/

reply


> And you're saying you think this is just fine?

Not sure where you read this. I didn't provide any judgement of the situation.

Strawman arguments like this should have no place on HN.

reply


How about ==== and ===== and ======?

For security reasons, I suggest PHP implement such operators... :D Example:

"abc" === 'abc'; # ==> true

"abc" ==== 'abc'; # ==> false, single-quote vs double-quote

"abc" ===== 'abc'; # ==> true, this is how it works

j.k :D

reply


Reminds me of bash, where I also have to prefix values to compare with x, to be able to handle empty vars.

    if [ x$1 == x$2 ];
But automatic string to float conversion is just crazy, esp. in comparison context. Perl, which is equally soft, has at least numerical and string comparison operators.

    $ perl -e'print "0e462097431906509019562988736854" ==
                    "0e830400451993494058024219903391"'
    1
    $ perl -e'print "0e462097431906509019562988736854" eq
                    "0e830400451993494058024219903391"'
So the solution is to use === which does not compare references with strings but the values, or the strcmp function. And refrain from using == with strings at all. '0XAB' == '0xab' is true. Comparing any string to 0 with == will return true.

reply


I'm not sure why this crazy "x" prefix tale still continues. You can simply quote them instead. Especially if you use bash and not some other sh-compatible shell:

    if [ "$1" == "$2" ];
will work just fine.

If you need full sh compatibility, it should test "x$1" anyway (still quoted).
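A minimal sketch of the quoted form, assuming a POSIX sh. `same_str` is a hypothetical helper name, not something from the parent comment, and it uses `=` because that is the portable operator for `test`:

```shell
#!/bin/sh
# same_str is a hypothetical helper, named here only for illustration.
same_str() {
    # Quoting keeps empty values and values with spaces as single
    # arguments, so test always sees exactly three operands.
    [ "$1" = "$2" ]
}

same_str "" "" && echo "two empty strings match"
same_str "a b" "a b" && echo "values with spaces match"
same_str "" "x" || echo "empty and non-empty differ"
```

Without the quotes, an empty `$1` would vanish from the argument list and `[` would report a syntax error instead of comparing anything.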

reply


I think you meant “=”, not “==” (though the latter would work with bash).

reply


Well, either works in the example. Parent was saying "Reminds me on bash"

For sh version, I'd go with super-safe:

    if test "x$1" = "x$2"

reply


If you're doing that, even better to use "x${1}" to be safer. Also, conditional expressions ( [[ instead of [ or `test`) are generally better behaved. See http://wiki.bash-hackers.org/syntax/ccmd/conditional_express... for more info.

reply


But [[ is a bashism - it won't work on bare sh.

reply


Actually, you don't prefix with “x” to handle empty vars, but to handle special characters, as Stephane Chazelas recently reminded us: http://www.zsh.org/mla/workers/2015/msg00797.html

reply


Again here conditional expressions should make this a non-issue ( [[ instead of [ ) since the stuff inside doesn't get parsed the same as general input. See http://wiki.bash-hackers.org/syntax/ccmd/conditional_express...

reply


Yes, but then you need either bash or zsh. It won't work on bare sh (or on dash, which is the default /bin/sh on Debian and derivatives like Ubuntu).

reply


http://eev.ee/blog/2012/04/09/php-a-fractal-of-bad-design/#o...

reply


http://forums.devshed.com/php-development-5/php-fractal-bad-...

reply


(Shameless plug) http://blog.hackensplat.com/2012/04/php-some-strings-are-mor...

At which point in this article do I start making stuff up about PHP's comparison operators?

reply


> If you want to compare two strings that are the same except they each use different ways of expressing an 'é', you need to add another equal sign and use ==== to differentiate them, as === will see them as equal.

reply


The fact that PHP is a dynamic language and that "==" automatically converts both operands to a float because of the "0e" prefix of the strings is problematic. Perhaps it's a bug in the PHP source code.

See below.

    # the examples were essentially equivalent to this comparison.
    php > var_dump("0e462097431906509019562988736854" == "0e830400451993494058024219903391");
    bool(true)

    # md5() does return a string type, but it just happens to start with "0e"
    php > var_dump(md5('240610708'));
    string(32) "0e462097431906509019562988736854"
    php > var_dump(md5('QNKCDZO'));
    string(32) "0e830400451993494058024219903391"

    # and if PHP treats them as floats instead of strings, they all evaluate to the same thing: float(0)
    php > var_dump(0e462097431906509019562988736854);
    float(0)
    php > var_dump(0e830400451993494058024219903391);
    float(0)
    php > var_dump(0e087386482136013740957780965295);
    float(0)

reply


One thing to note.

The md5 and sha1 interfaces have a second param which prevents this bug.

Instead of returning a string it will return binary data which won't get coerced to a float.

For example:

    <?php
    if (md5('240610708', true) == md5('QNKCDZO', true)){
        printf("Will never go here\n");
    }
PHP has a lot of.....PHPisms.

reply


There's no "binary data" type. Raw hash output can certainly start with bytes matching "0e" or "0E", it's just a lot more rare.

reply


PHP : 1 week is not always 7 days:

  $_1week = new DateInterval("P1W");
  $_7days = new DateInterval("P7D");
  var_dump($_1week == $_7days); // true
  var_dump($_1week);
  var_dump($_1week == $_7days); // false
  var_dump($_7days);
  var_dump($_1week == $_7days); // true
http://3v4l.org/CcAk8

Same result with '$_1week = new DateInterval("P7D");' :-)

reply


While I do believe that it is possible to write great Apps with PHP I tend to stay away from it because it is not statically typed. For quick and dirty proof of concept it is nice though (IMHO).

reply


I agree that all languages have their warts and a good programmer should know about them.

I think what makes both PHP and Javascript not so great is the fact that it is so easy to overlook deadly mistakes like using "==" instead of "===" or forgetting to add a "var". And worst of all those errors can go unnoticed until something breaks and when it does it's pretty hard to find out the root of the problem.

reply


All of this I can understand, but why octal numbers are not compared the same way is beyond me.

    var_dump(0xA == '0xA'); // bool(true)
    var_dump(012 == '012'); // bool(false)

reply


Check out which types those examples get cast to and it should make more sense :) I don't know the exact rules for type detection in PHP, but it looks like that's the cause.

reply


Just to make it clear, I did not come up with this example. Unfortunately I can't find the source anymore. It also contained some technical explanations about why this works. So if anyone remembers, I'd be happy if you could comment with the link.

reply


Ah, here it is: https://twitter.com/spazef0rze/status/523010190900469760

reply


Author of the original tweet here, thanks for sharing! Here's the link to the "original original" MD5 tweet https://twitter.com/spazef0rze/status/439352552443084800

For similar tricks for SHA-1 and plaintext see https://twitter.com/spazef0rze/status/523010190900469760

reply


sigh.. yes, == can be weird. get over it. any php dev worth anything knows to use ===

reply


It's appalling to think that there are 'developers' who are still only now realising PHP's horrendous nature.

reply


OK, how did this happen?

reply


PHP's `==` tries very hard (even harder than javascript's) to "please" the user. That means that if it can, it will fall back to converting both sides to numbers and comparing those.

Here all hashes are of the form "0e{digits}" which is a valid scientific notation, so when `==` internally converts them to numbers they're all parsed to `float(0)` and therefore equal, success!
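A rough shell sketch for spotting hashes of that shape, assuming a POSIX sh with `grep -E` available. `is_zero_float_shape` is a hypothetical helper name, and it only checks the textual form 0e{digits}; it does not reproduce PHP's full numeric-string rules:

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the argument looks like 0e<digits>,
# i.e. scientific notation with a zero mantissa.
is_zero_float_shape() {
    printf '%s\n' "$1" | grep -Eq '^0[eE][0-9]+$'
}

# The first two values are the MD5 hashes from the thread; the third is
# the MD5 of the empty string, which contains hex letters and does not match.
for h in 0e462097431906509019562988736854 \
         0e830400451993494058024219903391 \
         d41d8cd98f00b204e9800998ecf8427e
do
    if is_zero_float_shape "$h"; then
        echo "$h: reads as zero under a loose numeric comparison"
    else
        echo "$h: not of the 0e{digits} form"
    fi
done
```

Any two strings that pass this check would compare equal under the conversion described above, regardless of which hash function produced them.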

reply


usual story

== is not the same as ===

reply


not the usual story, == should be deprecated and a warning should be displayed. PHP has explicit coercion features, devs should use them alongside === .

reply


I mean it's the usual story when people post stuff about PHP comparisons.

http://php.net/md5

the example itself uses === although no explanation of why is given

reply


Here's a threaded app for finding such collisions: https://github.com/beched/php_hash_collision_finder

reply


There is also weird casting when a string starts with a digit: var_dump(10 == '10xyz');

reply


This is why Strong Typing is so important !

reply


  php > var_dump("hello" == 0);
  bool(true)
  php >

reply


I looked at this and said "oh well, at least hhvm is consistent".

reply


Well.. http://3v4l.org/tT4l8

reply


So this is a PHP fail. But all the same, MD5 has been shown to fail collision resistance several times now.

reply


The problem is that this is an issue with any comparison of hex digit strings. It's a possible issue with any hash function, not just MD5.

reply


Yes md5 is broken. We've known this for quite some time.

reply


The problem is with PHP, not MD5

reply


This has nothing to do with md5 itself.

reply


it's not _that_ broken

reply



