
I've been working on a task to find threats in a social network called Skyrock. As part of it, I've retrieved a significant number of URLs by crawling various user profiles and parsing the data that users share publicly on the network. Now I want to check whether any of these URLs is malicious.

I know there are a lot of online URL scanners available to check a link for malice, but I don't want to use any of those. Instead, I want to use information which I am able to retrieve about a URL via some third-party API, for example: location (lat, long), URL redirects, redirect count, DNS entries, etc. Can any of these properties be used to check if a particular URL is malicious? What other information about a link do I need to decide whether it's malicious?
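To make the kind of features I mean concrete, here is a minimal sketch of extracting the redirect chain and DNS entries for a URL in Python, using the `requests` library and the standard `socket` module (the feature names and the bare-bones error handling are my own illustration, not a scoring model):

```python
import socket
from urllib.parse import urlparse

import requests  # third-party: pip install requests


def extract_url_features(url, timeout=5):
    """Collect a few reputation-relevant properties of a URL."""
    features = {}

    # Follow the redirect chain and record its hops and length.
    resp = requests.get(url, allow_redirects=True, timeout=timeout)
    features["redirect_count"] = len(resp.history)
    features["redirect_chain"] = [r.url for r in resp.history] + [resp.url]
    features["final_status"] = resp.status_code

    # Resolve the final hostname to its A records (a crude stand-in
    # for the "DNS entries" mentioned above).
    host = urlparse(resp.url).hostname
    try:
        _, _, addresses = socket.gethostbyname_ex(host)
        features["ip_addresses"] = addresses
    except socket.gaierror:
        features["ip_addresses"] = []  # failed resolution is itself a weak signal

    return features


print(extract_url_features("http://example.com"))
```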

Note: A malicious link, in this case, could be a link to a phishing site, a spam link, or a link that makes you download malware. I just want to know which properties of a link are useful in figuring out whether it falls into one of these categories.

Rahil Arora

1 Answer


Online URL scanners typically use three methods to check whether a URL is malicious:

  1. Blacklists of known bad URLs. Someone reported that very URL as malicious and entered it into a database. These cases are trivial for the scanner to check, but keeping the database up to date requires constant effort. Considering that many malware-serving websites exist for only hours before they get removed by the hosting provider or get "un-hacked" by their real admin, such entries become stale very quickly (a minimal lookup sketch follows this list).
  2. Known malware samples. The scanner downloads the linked website and searches it for signature strings from a database of known web-based exploits. The half-life of such database entries is much longer than that of URLs, because some widely-used stock exploits are deployed over and over again on thousands of URLs. This method does not help against zero-day exploits, and it can be fooled by self-written exploits for known vulnerabilities or by running stock exploits through an obfuscator.
  3. Heuristics. By analyzing the HTML code and executing dynamic content in a sandbox, the scanner can find suspicious behavior and report it. Just as with normal virus scanners, the potential for false positives is high.
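As a minimal sketch of the blacklist lookup in method 1 (the blacklist entries and the normalization scheme here are illustrative assumptions, not a real threat feed):

```python
from urllib.parse import urlsplit

# Hypothetical local blacklist; in practice this would be fed from a
# threat-intelligence source and refreshed continuously.
BLACKLIST = {
    "evil.example/login",
    "malware.example/dl.exe",
}


def normalize(url):
    """Reduce a URL to a host/path key (a deliberately simplified scheme)."""
    parts = urlsplit(url.lower())
    return (parts.hostname or "") + parts.path.rstrip("/")


def is_blacklisted(url):
    return normalize(url) in BLACKLIST


print(is_blacklisted("https://EVIL.example/login/"))  # True
print(is_blacklisted("https://benign.example/home"))  # False
```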

Note that methods 2 and 3 aren't foolproof. The malicious website could detect that it is being fetched by the URL scanner and serve different content than it would serve to a regular visitor.
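A naive way to probe for that kind of cloaking is to fetch the same URL under different client identities and compare the responses. A sketch (the persona headers are assumptions, and dynamic pages will trigger false positives, so treat a mismatch as a hint, not proof):

```python
import hashlib

import requests  # third-party: pip install requests

# Two personas: a browser-like visitor and an undisguised scanner.
# The header values are illustrative, not taken from any real scanner.
PERSONAS = {
    "browser": {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    "scanner": {"User-Agent": "url-scanner-bot/1.0"},
}


def looks_cloaked(url, timeout=5):
    """Fetch the URL under both personas and compare the response bodies."""
    digests = {}
    for name, headers in PERSONAS.items():
        body = requests.get(url, headers=headers, timeout=timeout).content
        digests[name] = hashlib.sha256(body).hexdigest()
    return digests["browser"] != digests["scanner"]


print(looks_cloaked("http://example.com"))
```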

Philipp