Spam Classification

by Ted Highway.

Through the use of classification techniques and forensic data gathering, we can identify specific spam groups. In some cases the identification can include a specific individual; in other cases, groups of e-mails can be positively linked to the same unspecified group. Forensic tools and techniques can allow the identification of group attributes, such as nationality, left- or right-handedness, operating system preferences, and operational habits.

Spam Organization

There are two key items for identifying individual spammers or specific spam groups: the bulk-mailing tool and the spammer’s operational habits. People who send spam generally send millions of e-mails at a time. To maintain this high volume, spammers use bulk-mailing tools. These tools generate e-mail headers and e-mail attributes that can be used to distinguish e-mail generated by different mailing tools. Although some bulk-mailing tools do permit randomized header values, field ordering, and the like, the set of items that can be randomized and the set of random values are still limited to specific data subsets.
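One way to picture a tool-level attribute is the order of the header fields. The following is a minimal sketch, not a description of any particular forensic tool: the function name `header_fingerprint` and the sample messages are hypothetical, and a real fingerprint would use many more attributes than field order.

```python
from email import message_from_string


def header_fingerprint(raw_message):
    """Return the header field names, lowercased, in the order they appear."""
    msg = message_from_string(raw_message)
    return tuple(name.lower() for name in msg.keys())


# Hypothetical sample messages, invented for illustration only.
SAMPLE_A = (
    "Received: from relay.example\n"
    "Message-ID: <1@example>\n"
    "From: a@example.com\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
SAMPLE_B = (
    "Received: from other.example\n"
    "Message-ID: <2@example>\n"
    "From: b@example.com\n"
    "Subject: different text\n"
    "\n"
    "body\n"
)
SAMPLE_C = (
    "From: c@example.com\n"
    "Subject: hi\n"
    "Message-ID: <3@example>\n"
    "\n"
    "body\n"
)
```

Here `SAMPLE_A` and `SAMPLE_B` differ in content but share a field order, so they get the same fingerprint, while `SAMPLE_C` does not. Matching field order alone is weak evidence; it only becomes useful when combined with other attributes.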

More important than the mailing tool is the fact that spammers are people, and people act consistently (until they need to change). They will use the same tools, the same systems, and the same feature subsets in the same order every time they do their work.

Identification is made simpler by the fact that most spammers appear to be cheap. Commercial bulk-mailing tools exist, but most are very expensive, so spammers would rather create their own tools or pay someone to create a cheaper tool for them. Custom tools may have a limited distribution, but different users will use the tools differently. For example, Secure Science Corporation (SSC), a San Diego, California-based technology research company, has a unique forensic research tool that generates a unique header that is used in a unique way, which, in many cases, makes it easy to sort and identify e-mails.

There are many different types of spam, and identifying an individual or group from the whole collection is very difficult. But there are things we can do to filter the spam. For example, a significant number of spam messages have capital-letter hash busters (random strings added so that otherwise identical messages differ) located at the end of the subject line. So, we can sort the spam and look only at messages with capital-letter subject hash busters.
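The subject-line filter can be sketched as follows. This is a rough heuristic under stated assumptions, not the method used to produce the results discussed here: the function names, the minimum length of four capitals, and the rule that skips all-capital subjects are all illustrative choices.

```python
import re

# Assumption: a hash buster is a run of 4+ capital letters at the very end
# of the subject. The length threshold is an arbitrary illustrative choice.
_HASH_BUSTER_RE = re.compile(r"([A-Z]{4,})\s*$")


def trailing_hash_buster(subject):
    """Return the trailing capital-letter token, or None if there is none."""
    match = _HASH_BUSTER_RE.search(subject)
    if match is None:
        return None
    prefix = subject[:match.start(1)]
    # Skip subjects written entirely in capitals: there the last word is
    # probably ordinary shouting rather than a random token.
    if not any(ch.islower() for ch in prefix):
        return None
    return match.group(1)


def with_subject_hash_buster(subjects):
    """Keep only the subjects that end in a capital-letter hash buster."""
    return [s for s in subjects if trailing_hash_buster(s) is not None]
```

A heuristic this simple will produce false positives (a subject ending in an acronym, for instance), so it works as a first-pass sort rather than as proof of common origin.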

By sorting the spam based on specific features, we can detect some organization. We can further examine these e-mails and look for additional common attributes. For example, a significant number of spam messages have a Date header with a time zone of -1700. On planet Earth, there is no -1700 time zone (real UTC offsets run from -1200 to +1400), so this becomes a unique attribute that can be used to further organize the spam.
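A check for impossible time zones like this one might look like the sketch below. It assumes the offset appears in numeric form at the end of the Date header (optionally followed by a parenthesized comment); the function names and sample dates are hypothetical.

```python
import re

# Numeric offset such as "-0800" at the end of the header, optionally
# followed by a comment such as "(PST)".
_OFFSET_RE = re.compile(r"([+-])(\d{2})(\d{2})\s*(?:\(.*\))?\s*$")

# Real-world UTC offsets fall between -12:00 and +14:00.
MIN_OFFSET_MINUTES = -12 * 60
MAX_OFFSET_MINUTES = 14 * 60


def tz_offset_minutes(date_header):
    """Return the numeric UTC offset in minutes, or None if absent."""
    match = _OFFSET_RE.search(date_header)
    if match is None:
        return None
    sign = -1 if match.group(1) == "-" else 1
    return sign * (int(match.group(2)) * 60 + int(match.group(3)))


def has_impossible_timezone(date_header):
    """True when the header carries an offset no real time zone uses."""
    offset = tz_offset_minutes(date_header)
    if offset is None:
        return False
    return offset < MIN_OFFSET_MINUTES or offset > MAX_OFFSET_MINUTES
```

For example, `has_impossible_timezone("Tue, 23 Nov 2004 10:15:00 -1700")` is true, while the same date with `-0800` is not flagged. Headers that use a named zone such as `GMT` are simply left unflagged by this sketch.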

Based on the results of this minimal organization, we can identify specific attributes of the spammer:

■ The hash buster is nearly always connected to the subject.

■ The subject typically does not end with punctuation. However, if punctuation is included, it is usually an exclamation point.

■ The messages are roughly the same length (between 50 and 140 lines, which is short compared to most spam messages).

■ Every one of the forged e-mail addresses claims to come from yahoo.com.

■ Every one of the fake account names appears to be repetitive letters followed by a number. In particular, the letters are predominantly from the left-hand side of the keyboard. This particular bulk-mailing tool requires the user to specify the fake account name. This can be done in one of two ways: the user can either import a database of names or type them in by hand. In this case, the user is drumming his or her left hand on the keyboard (bcvbcv and cxzxca indicate finger drumming). With the right hand on the mouse, the user then clicked Enter. Since the user’s right hand is on the mouse, the user is very likely right-handed.
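The account-name observation in the last item can be approximated in code. The sketch below assumes a QWERTY layout with conventional touch-typing hand assignments; the function names, the 0.9 threshold, and the digits in the usage examples are hypothetical.

```python
import re

# Assumption: QWERTY layout, keys conventionally typed with the left hand.
LEFT_HAND_KEYS = set("qwertasdfgzxcvb")

# "Repetitive letters followed by a number", e.g. letters then digits.
_LETTERS_THEN_NUMBER = re.compile(r"^[a-z]+\d+$")


def local_part(address):
    """Return the lowercased account name before the '@'."""
    return address.split("@", 1)[0].lower()


def left_hand_ratio(address):
    """Fraction of letters in the account name typed with the left hand."""
    letters = [ch for ch in local_part(address) if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in LEFT_HAND_KEYS for ch in letters) / len(letters)


def looks_drummed(address, threshold=0.9):
    """True for letters-then-number names typed almost entirely left-handed."""
    name = local_part(address)
    if _LETTERS_THEN_NUMBER.match(name) is None:
        return False
    return left_hand_ratio(address) >= threshold
```

With these assumptions, an address such as `bcvbcv17@yahoo.com` is flagged and `johnny42@yahoo.com` is not. Short ordinary names that happen to sit on the left side of the keyboard would also be flagged, so the ratio is only meaningful across a large batch of addresses.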

Although this spammer sends spam daily, he does take an occasional day off: for example, Thanksgiving, New Year’s Eve, the Fourth of July, a few days after Christmas, and every Raiders home game. Even though this spammer always relays through open SOCKS servers that could be located anywhere in the world, we know that the spammer is located in the United States. We can even identify the region as the Los Angeles basin, with annual travel in the spring to Chicago (for one to two months) and in the fall to Mexico City (for one to two weeks).
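Finding the days off is a matter of looking for gaps in an otherwise daily sending record. A minimal sketch, assuming the sending dates have already been extracted and attributed to one sender; the function name is hypothetical.

```python
from datetime import date, timedelta


def days_without_spam(send_dates):
    """Return the dates between the first and last sighting with no spam."""
    seen = set(send_dates)
    if not seen:
        return []
    day, last = min(seen), max(seen)
    gaps = []
    while day <= last:
        if day not in seen:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps
```

Given invented sightings on November 23, 24, 26, and 27, 2004, the function returns November 25 alone. Comparing such gaps against national holidays and local events is what ties the sender to a country or region.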

The main items that help in this identification are:

■ Bulk-mailing tool identification: This does not necessarily mean identifying the specific tool; rather, this is the identification of unique mailing attributes found in the e-mail header.

■ Feature subsets: Items such as hash busters (format and location), content attributes (spelling errors, grammar), and unique feature subsets from the bulk-mailing tool.

■ Sending methods: Does the spammer use open relays or compromised hosts? Is there a specific time of day that the sender prefers?

The result from this classification is a profile of the spammer and/or his spamming group.
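Combining the attributes into a profile can be sketched as a grouping step: messages that share every attribute land in the same bucket. The field names in `ProfileKey` and the sample values are hypothetical, chosen to mirror the attributes discussed above rather than any real tool's output.

```python
from collections import defaultdict
from typing import NamedTuple


class ProfileKey(NamedTuple):
    """Hypothetical attribute set used to bucket messages."""
    header_order: tuple
    has_subject_hash_buster: bool
    tz_offset: str
    from_domain: str


def group_by_profile(messages):
    """Group message IDs by profile key.

    `messages` is an iterable of (message_id, ProfileKey) pairs.
    """
    groups = defaultdict(list)
    for message_id, key in messages:
        groups[key].append(message_id)
    return dict(groups)


# Invented sample data for illustration only.
KEY_1 = ProfileKey(("received", "from", "subject"), True, "-1700", "yahoo.com")
KEY_2 = ProfileKey(("from", "subject", "received"), False, "+0000", "example.com")
SAMPLE = [("msg-1", KEY_1), ("msg-2", KEY_2), ("msg-3", KEY_1)]
```

With the sample above, `group_by_profile(SAMPLE)` puts `msg-1` and `msg-3` in one group and `msg-2` in another. Exact matching is the simplest possible design; it breaks as soon as a spammer changes one attribute, which is why a real profile would need some tolerance for partial matches.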

Classification Techniques

After we identify and profile individual spam groups, we can discern their intended purpose. To date, there are eight specific top-level spam classifications, including these four:

■ Unsolicited commercial e-mail (UCE): This type is generated by a genuine company trying to contact existing or potential customers. True UCE is extremely rare, accounting for less than one-tenth of 1 percent of all spam. (If all UCE were to vanish today, nobody would notice.)

■ Nonresponsive commercial e-mail (NCE): NCE is sent by a genuine company that continues to contact a user after being told to stop. The key differences between UCE and NCE are (1) the user initiated contact and (2) the user later opted out of future communication. Even though the user opted out, the NCE mailer will continue to contact the user. NCE is only a problem for people who subscribe to many services, purchase items online, or initiate contact with the NCE company.

■ List makers: These are spam groups that make money by harvesting e-mail addresses and then using the list for profit, such as selling the list to other spammers or marketing agencies.

■ Scams: Scams constitute the majority of spam. The goal of the scam is to acquire valuable assets through misrepresentation. Subsets under scams include 419 (“Nigerian-style” scams), malware, and phishing.

Phishing

Phishing is a subset of the scam category. Phishers represent themselves as respected companies (the target) to acquire customer accounts, information, or access privileges. Through the classification techniques just described, we can identify specific phishing groups. The key items for identification include:

■ Bulk-mailing tool identification and features

■ Mailing habits, including, but not limited to, their specific patterns and schedules

■ Types of systems used for sending the spam (e-mail origination host)

■ Types of systems used for hosting the phishing server

■ Layout of the hostile phishing server, including the use of HTML, JS, PHP, and other scripts

To date, according to SSC, there are an estimated four dozen phishing groups worldwide, with more than half the groups targeting customers in the United States.

    Copyright © 2006 - 2012 e-articles.info.
The texts, articles and tutorials in the directory are property of their respective owners and authors.