GitHub Repository: rapid7/metasploit-framework
Path: blob/master/data/msfcrawler/basic.rb

##
# This module requires Metasploit: https://metasploit.com/download
# Current source: https://github.com/rapid7/metasploit-framework
##

require 'pathname'
require 'nokogiri'
require 'uri'

# Basic msfcrawler parser: passes every followable <a href> link on an HTML
# page to insertnewpath as a GET request.
class CrawlerSimple < BaseParser

  def parse(request, result)
    # Only HTML responses are parsed; to_s guards against a missing
    # Content-Type header, which would otherwise raise NoMethodError on nil.
    return unless result['Content-Type'].to_s.include?('text/html')

    # doc = Hpricot(result.body.to_s)
    doc = Nokogiri::HTML(result.body.to_s)
    doc.css('a').each do |anchor_tag|
      hr = anchor_tag['href']
      # Skip anchors without an href, in-page fragments, and javascript: links.
      if hr && !hr.match(/^(\#|javascript\:)/)
        begin
          hreq = urltohash('GET', hr, request['uri'], nil)
          insertnewpath(hreq)
        rescue URI::InvalidURIError
          # Ignore hrefs that cannot be parsed as a URI.
        end
      end
    end
  end
end
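
The href filtering and error handling in `parse` can be tried in isolation. The following is a minimal sketch using only Ruby's standard library, not part of msfcrawler: `resolve_links` is a hypothetical helper, and `URI.join` stands in for `urltohash` and `insertnewpath`, whose actual behavior is defined in `BaseParser` and is not shown in this file. The base URL and href list are made-up sample inputs.

```ruby
require 'uri'

# Same filter as the parser: in-page fragments and javascript: pseudo-links.
SKIP_HREF = /^(\#|javascript\:)/

# Hypothetical helper (an assumption for illustration): keeps the hrefs the
# parser would follow and resolves each one against the page URI.
def resolve_links(base_uri, hrefs)
  hrefs.each_with_object([]) do |href, out|
    # Anchors without an href yield nil; skip them along with filtered links.
    next if href.nil? || href.match(SKIP_HREF)

    begin
      out << URI.join(base_uri, href).to_s
    rescue URI::InvalidURIError
      # Same policy as the parser: silently skip hrefs that are not valid URIs.
      next
    end
  end
end

links = resolve_links(
  'http://example.com/app/index.html',
  ['about.html', '#top', 'javascript:void(0)', '/login', nil, 'bad path']
)
puts links
```

Only the relative and root-relative links survive here; the fragment, the `javascript:` link, the missing href, and the href containing a space are all dropped, which mirrors what the parser hands on for crawling.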