Infopost | 2015.03.22

A few months ago Steve and I noted that our respective sites were getting tons of hits from Samara Oblast, an obscure(?) territory in Russia. Russian search engine maybe? Cybercriminals? Proxy for the American or Chinese or Syrian electronic armies? Who really cares? Only port 80 should be open, and it's doing nothing fancy.

But since this kilroy thing has gotten pretty lengthy I was scoping the possibility of doing some sort of 'top content' thing based on hits. So I pulled my server logs and was looking through them to see how hard it'd be to parse.

Attack surface

Missile command screenshot

Well this is fun:

91.200.13.119 "GET /kilroy/archive/2008/04/index.html HTTP/1.0"...
91.200.13.119 "GET /kilroy/2008/01/leader-board-r.html HTTP/1.0"...
91.200.13.119 "GET /kilroy/2008/01/index.php HTTP/1.0"...
91.200.13.119 "GET /2008/01/index.php HTTP/1.0"...
91.200.13.119 "GET /kilroy/2008/01/index.php HTTP/1.0"...
91.200.13.119 "GET /2008/01/index.php HTTP/1.0"...
91.200.13.119 "GET /kilroy/2008/01/index.php HTTP/1.0"...
91.200.13.119 "GET /2008/01/index.php HTTP/1.0"...

How am I going to count hits for 2008/01/index.php when there is no anything.php?

Eight sequential hits from the same person, within 10 seconds. That's what I call quick on the mouse. Whois says it's from Ukraine. I'm going to stop me right here: this is my first time actually looking at HTTP traffic, and this is old hat to 80% of the world. Okay, let's continue.

Maybe they're just guessing at the site map, but more likely they're looking to have some fun with PHP.
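A burst like this is easy to flag mechanically. Here's a minimal sketch, assuming the log has already been parsed into (ip, timestamp) pairs. The function name, the 10-second window, and the threshold of 8 are illustrative choices, and the timestamps are made up, since the log excerpt above doesn't show them.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_bursts(hits, window=timedelta(seconds=10), threshold=8):
    """Return the set of IPs with at least `threshold` hits inside any `window`.

    `hits` is an iterable of (ip, datetime) pairs. That shape is an assumption
    about how the log was parsed, not a real log format.
    """
    by_ip = defaultdict(list)
    for ip, when in hits:
        by_ip[ip].append(when)

    bursty = set()
    for ip, times in by_ip.items():
        times.sort()
        start = 0
        # Slide a window over the sorted timestamps for this IP.
        for end, when in enumerate(times):
            while when - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                bursty.add(ip)
                break
    return bursty


# Made-up timestamps: eight hits from one address, one second apart,
# plus a single hit from a documentation-range address.
t0 = datetime(2015, 3, 1, 12, 0, 0)
sample = [("91.200.13.119", t0 + timedelta(seconds=i)) for i in range(8)]
sample.append(("192.0.2.7", t0))

print(find_bursts(sample))  # {'91.200.13.119'}
```

A human clicking around rarely manages eight page loads in ten seconds, so a threshold in this neighborhood separates scripted probing from ordinary browsing reasonably well, though the right numbers depend on the site.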

Another interesting one:

POST /cgi-bin/php?
%2D%64+%61%6C%6C%6F%77%5F%75%72%6C%5F%69%6E%63%6C%75%64%65
%3D%6F%6E+%2D%64+%73%61%66%65%5F%6D%6F%64%65%3D%6F%66%66+%2D%64+%73%75%68
%6F%73%69%6E%2E%73%69%6D%75%6C%61%74%69%6F%6E%3D%6F%6E+%2D%64+%64%69%73%61%62
%6C%65%5F%66%75%6E%63%74%69%6F%6E%73%3D%22%22+%2D%64+%6F%70%65%6E%5F%62%61
%73%65%64%69%72%3D%6E%6F%6E%65+%2D%64+%61%75%74%6F%5F%70%72%65%70%65%6E%64
%5F%66%69%6C%65%3D%70%68%70%3A%2F%2F%69%6E%70%75%74+%2D%64+%63%67%69%2E%66
%6F%72%63%65%5F%72%65%64%69%72%65%63%74%3D%30+%2D%64+%63%67%69%2E%72%65%64
%69%72%65%63%74%5F%73%74%61%74%75%73%5F%65%6E%76%3D%30+%2D%6E HTTP/1.1

Looking to do injection or overflow or something? Not really my wheelhouse, but it was kind of a fun digression.
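Percent-decoding the query string shows what the request is after. A minimal sketch using `unquote_plus` from Python's standard `urllib.parse`, with only the first and last fragments of the logged string pasted in to keep it short:

```python
from urllib.parse import unquote_plus

# First and last pieces of the query string from the POST request above.
head = "%2D%64+%61%6C%6C%6F%77%5F%75%72%6C%5F%69%6E%63%6C%75%64%65%3D%6F%6E"
tail = (
    "%2D%64+%63%67%69%2E%72%65%64%69%72%65%63%74%5F%73%74%61%74%75%73"
    "%5F%65%6E%76%3D%30+%2D%6E"
)

# unquote_plus turns %XX escapes into characters and '+' into spaces.
print(unquote_plus(head))  # -d allow_url_include=on
print(unquote_plus(tail))  # -d cgi.redirect_status_env=0 -n
```

The full string decodes to a run of `-d name=value` PHP settings (allow_url_include=on, safe_mode=off, auto_prepend_file=php://input, and so on) ending in `-n`. That looks like the widely reported PHP-CGI argument injection bug (CVE-2012-1823), where the query string ends up being read by the php-cgi binary as command-line switches. So "injection" appears to be the right guess: the settings tell PHP to run whatever code arrives in the POST body.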

Classifier

So I wrote some code to classify site traffic into categories such as visits, bots, and malicious requests.

Some of it was pretty easy: bots tend to declare themselves in the user agent string and hit robots.txt first. Malicious stuff sends PUTs and looks for files that aren't .html/.jpg/etc. And, of course, sequential traffic from the same IP can be classified together. This is important because an attack might hit numerous legit links, but it's not visit traffic.
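The rules above can be sketched roughly as follows. This is not the actual classifier, just an illustration under stated assumptions: the `Hit` record, the marker and extension lists, and the label names are all hypothetical. Treating every method other than GET/HEAD as suspicious is a generalization of the PUT rule that assumes a static site, and grouping by IP across the whole log is a simplification of "sequential traffic from the same IP".

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical lists for illustration; a real site would tune these.
BOT_MARKERS = ("bot", "spider", "crawler")
SITE_EXTENSIONS = {"", ".html", ".jpg", ".png", ".css", ".js", ".txt", ".xml"}
ALLOWED_METHODS = {"GET", "HEAD"}  # assumption: a static site needs nothing else
SEVERITY = {"visit": 0, "bot": 1, "malicious": 2}


@dataclass
class Hit:
    ip: str
    method: str
    path: str
    user_agent: str


def classify_hit(hit):
    """Label a single request as 'bot', 'malicious', or 'visit'."""
    agent = hit.user_agent.lower()
    if hit.path == "/robots.txt" or any(m in agent for m in BOT_MARKERS):
        return "bot"
    # Drop any query string before looking at the file extension.
    extension = PurePosixPath(hit.path.split("?", 1)[0]).suffix.lower()
    if hit.method not in ALLOWED_METHODS or extension not in SITE_EXTENSIONS:
        return "malicious"
    return "visit"


def classify_by_ip(hits):
    """Give every hit the most severe label seen from its IP.

    `hits` must be a list (it is walked twice).
    """
    worst = {}
    for hit in hits:
        label = classify_hit(hit)
        current = worst.get(hit.ip, "visit")
        worst[hit.ip] = max(label, current, key=SEVERITY.get)
    return [(hit, worst[hit.ip]) for hit in hits]


# User agents here are made up; the log excerpt earlier doesn't show them.
hits = [
    Hit("91.200.13.119", "GET", "/kilroy/archive/2008/04/index.html", "Mozilla/5.0"),
    Hit("91.200.13.119", "GET", "/kilroy/2008/01/index.php", "Mozilla/5.0"),
    Hit("198.51.100.4", "GET", "/robots.txt", "ExampleBot/1.0"),
    Hit("203.0.113.9", "GET", "/kilroy/index.html", "Mozilla/5.0"),
]
for hit, label in classify_by_ip(hits):
    print(hit.ip, hit.path, label)
```

On its own the first request looks like a visit, since it asks for a real .html page. Rolling labels up by IP is what marks it malicious along with the .php probe from the same address.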

Data

Logs go back about a year. Here's some Excel, because easy.

Classification of web site hits

I get indexed about twice as much as I get visited. There have been more than 20,000 malicious HTTP requests.

Web site bot hits histogram

Google, Baidu, and Majestic 12 (a distributed indexing project) turned up the most. But there are quite a few bots out there.

So the top visited content, the main reason for this whole endeavor:

Pages
Images
Labels - which are now just links to search

Data skew: some content has been around longer. On the other hand, the logs are only from about a year back.
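Once hits are labeled, the ranking itself is a simple tally. A minimal sketch, assuming a list of (path, label) pairs as input. The function name, the sample paths, and the labels are made up for illustration. Counting only 'visit' hits is one way to keep bots and attacks from inflating the numbers.

```python
from collections import Counter


def top_content(labeled_hits, n=10):
    """Return the `n` most-requested paths among hits labeled 'visit'."""
    counts = Counter(path for path, label in labeled_hits if label == "visit")
    return counts.most_common(n)


# Made-up sample: three real visits, one bot hit, one probe.
sample = [
    ("/kilroy/index.html", "visit"),
    ("/kilroy/index.html", "visit"),
    ("/kilroy/2008/01/leader-board-r.html", "visit"),
    ("/kilroy/index.html", "bot"),
    ("/kilroy/2008/01/index.php", "malicious"),
]
print(top_content(sample, n=2))
# [('/kilroy/index.html', 2), ('/kilroy/2008/01/leader-board-r.html', 1)]
```

Dividing each count by the number of days the page has existed inside the log window would be one way to soften the skew toward older content.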

When I get some more fun-coding time I'll see about putting this in the sidebar.


