I was trying to create some complicated Gmail filters. However, there doesn't seem to be any documentation of how the to and from fields work exactly. So I tried figuring it out myself...
General Matching Guidelines
The matching criteria are similar to Google search. There is no word stemming, so you must enter full words (e.g. joh will not match john.smith@gmail.com). Not even plural stemming, which Google search does have, is supported (e.g. app will not match apps@example.com).
Word order does not matter, unless the words are enclosed in quotes (e.g. "smith john" will not match john.smith@gmail.com). Generally, symbols are ignored (for more information see the next section).
Words are split on everything except: letters, numbers, and underscores. The most common symbols that split words are +.@. This means that foo will not match foo_bar@example.com but will match foo+bar@example.com. The @ character itself is not considered a word and can be skipped over (e.g. "smith gmail" will match john.smith@gmail.com).
You can use the OR operator in addition to grouping () for some complex conditions.
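The word-splitting and word-order rules in this section can be modelled with a few lines of Python. This is only a sketch of the behavior I observed, not Gmail's actual implementation; the function names are made up for illustration, and lowercasing everything is my own assumption.

```python
import re

def split_words(text: str) -> list[str]:
    """Split on everything except letters, numbers, and underscores."""
    # Lowercasing is an assumption of this sketch, not something tested above.
    return [w for w in re.split(r"[^A-Za-z0-9_]+", text.lower()) if w]

def matches_all_words(query: str, address: str) -> bool:
    """Unquoted query: every query word must be a whole word, in any order."""
    words = split_words(address)
    return all(q in words for q in split_words(query))

def matches_phrase(phrase: str, address: str) -> bool:
    """Quoted query: the words must appear next to each other, in order."""
    words, wanted = split_words(address), split_words(phrase)
    n = len(wanted)
    return n > 0 and any(words[i:i + n] == wanted
                         for i in range(len(words) - n + 1))

print(split_words("john.smith@gmail.com"))
print(matches_all_words("joh", "john.smith@gmail.com"))
print(matches_all_words("foo", "foo_bar@example.com"))
print(matches_all_words("foo", "foo+bar@example.com"))
print(matches_phrase("smith john", "john.smith@gmail.com"))
print(matches_phrase("smith gmail", "john.smith@gmail.com"))
```

The printed results line up with the examples above: joh and foo (against foo_bar) fail, foo against foo+bar succeeds, and the quoted phrase only matches when the order is right.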
Symbol Behavior
Symbols entered in the filter box behave differently depending on which symbol it is:
- Symbols that act as x y: ~#$%^*+;",<>? and the grave character. For example, smith~john becomes smith john, which matches john.smith@gmail.com.
- Symbols that act as "x y": -=\:'./ For example, john-smith becomes "john smith", which matches john.smith@gmail.com.
- Symbols that are treated literally: &_ For example, john_smith will match john_smith@gmail.com, but not john.smith@gmail.com.
- Special symbols: !@()[]{}|
  - !: john!smith becomes john -smith, which matches john.foo@gmail.com but not john.smith@gmail.com.
  - @:
    - @ is stripped out at the end of a word. For example, john@ becomes john, which matches john.smith@gmail.com.
    - @ is stripped out at the start of a word. For example, @foo.com will become foo.com, which matches john+foo.com@gmail.com.
    - @ in the middle of a word will generally require the full address for a successful match. For example, john.smith@gmail will not match john.smith@gmail.com. Additionally, symbols will be taken literally. For example, to match john.smith@gmail.com you must use john.smith@gmail.com. Both john-smith@gmail.com and john~smith@gmail.com will no longer work.
    - @ in a different location in the middle of a word has strange behavior. For example, when trying to match john.smith@gmail.com:
      - john@smith@gmail@com does not match
      - gmail@com does not match
      - @gmail@com does not match
      - smith@gmail@com does match
      - smith@gmail.com does not match
      - "john smith@gmail.com" does not match
      - "john.smith@gmail com" does not match
  - | acts as the OR operator.
  - Parentheses act as grouping for OR and AND filters.
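To make the symbol groups easier to follow, here is a rough Python sketch that rewrites a single filter term the way the list above describes. It is a guess at the observable behavior only: rewrite_term and the two patterns are hypothetical names, the odd cases with @ in the middle of a word are not modelled (the term is returned untouched), and | and the bracket characters are passed through as-is.

```python
import re

# Symbols that behave like a space (including the grave character).
# The lookahead also starts a new chunk in front of every '!'.
CHUNK_BOUNDARY = re.compile(r'[~#$%^*+;",<>?`]+|(?=!)')
# Symbols that glue their neighbours into a quoted phrase.
PHRASE_LIKE = re.compile(r"[-=\\:'./]+")

def rewrite_term(term: str) -> str:
    """Rewrite one whitespace-free filter term (illustrative model only)."""
    term = term.strip("@")          # '@' at the start or end is dropped
    if "@" in term:
        return term                 # '@' in the middle: not modelled here
    rewritten = []
    for chunk in CHUNK_BOUNDARY.split(term):
        negated = chunk.startswith("!")
        parts = [p for p in PHRASE_LIKE.split(chunk.lstrip("!")) if p]
        if not parts:
            continue
        # One word stays bare; several words become a quoted phrase.
        text = parts[0] if len(parts) == 1 else '"' + " ".join(parts) + '"'
        rewritten.append(("-" if negated else "") + text)
    return " ".join(rewritten)

for term in ["smith~john", "john-smith", "john_smith", "john!smith", "john@"]:
    print(term, "=>", rewrite_term(term))
```

Running it prints smith john, "john smith", john_smith, john -smith, and john for the five sample terms, which are the rewrites given in the list.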
Other Matching Behaviors
The default account you use (e.g. john.smith@gmail.com) will match all variations of your address. This includes dot notation, plus addressing, and using the googlemail.com domain.
Here's a brief explanation of each:
- Using dot notation: You can enter as many non-consecutive dots in your email as you want. For example, if your email is john.smith@gmail.com, mail sent to j.o.h.n.s.mith@gmail.com will still arrive at your account.
- Using plus addressing: After your account name, you can enter the + sign and whatever text you want afterwards followed by the Gmail domain. For example, mail sent to john.smith+foo@gmail.com will arrive at john.smith@gmail.com.
- Using googlemail.com domain: Any mail sent to your <your-gmail-account>@googlemail.com will arrive at your @gmail.com address. For example, mail sent to john.smith@googlemail.com will arrive at john.smith@gmail.com.
Any of the above can be combined (e.g. j.o.h.n.s.m.i.t.h+foo.bar@googlemail.com will still go to john.smith@gmail.com).
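If you want to check whether two addresses land in the same inbox, the three rules above can be collapsed into one normalization step. The sketch below is my own illustration (canonical_gmail is a made-up helper, and treating addresses as case-insensitive is an assumption); it simply drops the dots, cuts everything from the first + onwards, and maps googlemail.com to gmail.com.

```python
def canonical_gmail(address: str) -> str:
    """Reduce a Gmail address variant to one comparable form (sketch)."""
    local, _, domain = address.rpartition("@")
    if domain.lower() not in ("gmail.com", "googlemail.com"):
        return address                       # not a Gmail address: leave alone
    local = local.lower().split("+", 1)[0]   # drop the plus-address suffix
    local = local.replace(".", "")           # dots in the account name are ignored
    return f"{local}@gmail.com"

print(canonical_gmail("j.o.h.n.s.m.i.t.h+foo.bar@googlemail.com"))
print(canonical_gmail("john.smith@gmail.com"))
```

Both calls print johnsmith@gmail.com, so the two addresses belong to the same account.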
Interesting Consequences
- Can't match all dot versions of your Gmail address easily: If you're in the habit of giving out a dotted version of your email address to prevent spam (e.g. j.ohn.smith@gmail.com), you cannot easily create a filter for all dot versions of your address, since each one is split up into separate words (e.g. j ohn smith). When you only use one variation, it's easy to create a filter and, for example, send it to spam. However, if you start using different variations (e.g. jo.h.n.smi.th@gmail.com), each one produces different words (e.g. jo h n smi th), forcing you to create a distinct condition for each variation you use.
- The + symbol is worse than the "" operator when matching plus addresses: If you're trying to create a filter for a plus address, your best bet is to include the full address (e.g. john.smith+foo@gmail.com). If for some reason you aren't using the full address, the + operator is actually worse than the "" operator. For example, john+foo is worse than "john foo", since the former will match foo@john.com. Keep in mind that the latter is not bulletproof either: it will still match foo@john.foo.com. It just guarantees that the order is correct. For clarity, you could use "john+foo", but realize that it's the same as "john foo".
- You must use negation to match all email sent to plus addresses: To filter on all plus addresses (e.g. to send them to spam), you should use the query john.smith@gmail.com -"john smith gmail com". The first part of the query matches your address, including any plus addresses you have. The second part removes every address that has the words in that exact order, which is the plain address. For example, john.smith+foo@gmail.com is not removed, since it has the word foo in between the other words, so it stays in the results. Note that there is one weird, and very unlikely, case where this won't work: john.smith+john.smith.gmail.com@gmail.com, since it does have the words in the specified order.
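The last two points are easier to see when the queries are run against a small word-based model. The sketch below is an approximation built from the rules in this post, not Gmail's real matcher; the helper names are hypothetical, and plus_filter_hits assumes the first half of the query (john.smith@gmail.com) has already matched because the mail was delivered to that account.

```python
import re

def words(text: str) -> list[str]:
    """Split on everything except letters, numbers, and underscores."""
    return [w for w in re.split(r"[^a-z0-9_]+", text.lower()) if w]

def has_all_words(query: str, address: str) -> bool:
    """Unquoted query such as john+foo: word order is ignored."""
    have = words(address)
    return all(w in have for w in words(query))

def has_phrase(phrase: str, address: str) -> bool:
    """Quoted query such as "john foo": words must be adjacent and in order."""
    have, want = words(address), words(phrase)
    return any(have[i:i + len(want)] == want
               for i in range(len(have) - len(want) + 1))

def plus_filter_hits(address: str) -> bool:
    """Model of the negated part: -"john smith gmail com"."""
    return not has_phrase("john smith gmail com", address)

print(has_all_words("john+foo", "foo@john.com"))      # order ignored
print(has_phrase("john foo", "foo@john.com"))         # order enforced
print(has_phrase("john foo", "foo@john.foo.com"))     # still not bulletproof
print(plus_filter_hits("john.smith+foo@gmail.com"))
print(plus_filter_hits("john.smith@gmail.com"))
print(plus_filter_hits("john.smith+john.smith.gmail.com@gmail.com"))
```

In this model the output is True, False, True for the + versus "" comparison, and True, False, False for the negation query, with the last line being the unlikely edge case mentioned above.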
All important tools are already provided by Google. However, I would be happier if they added an email encryption tool directly inside Gmail.
Haha, never realized how limited Gmail filtering really is... I just want to match a [tag] prepended to the subject field. Fat chance ):
Thanks for your article.
Nevertheless, I can't find how to isolate an entire domain. For example, I would like to find all emails from @orange.com (xyz@orange.com and not xyz@[something]orange.com). How can I do that?