GitHub Repository: rapid7/metasploit-framework
Path: blob/master/lib/anemone/extractors/generic.rb
require 'uri'

class Anemone::Extractors::Generic < Anemone::Extractors::Base

  def run
    URI.extract( doc.to_s, %w(http https) ).map do |u|
      #
      # This extractor needs to be a tiny bit intelligent because
      # due to its generic nature it'll inevitably match some garbage.
      #
      # For example, if some JS code contains:
      #
      #    var = 'http://blah.com?id=1'
      #
      # or
      #
      #    var = { 'http://blah.com?id=1', 1 }
      #
      #
      # The URI.extract call will match:
      #
      #    http://blah.com?id=1'
      #
      # and
      #
      #    http://blah.com?id=1',
      #
      # respectively.
      #
      if !includes_quotes?( u )
        u
      else
        if html.include?( "'#{u}" )
          u.split( '\'' ).first
        elsif html.include?( "\"#{u}" )
          u.split( '"' ).first
        else
          u
        end
      end
    end
  rescue
    []
  end

  def includes_quotes?( url )
    url.include?( '\'' ) || url.include?( '"' )
  end

end
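
The quote-trimming rule described in the comments of `run` can be tried outside the crawler. The following is a minimal standalone sketch, not part of Anemone: the names `trim_quoted_url` and `extract_urls` are hypothetical, and the page text is passed in as a plain string instead of being read from the extractor's `doc` and `html`.

```ruby
require 'uri'

# Hypothetical helper (not part of Anemone). If the matched URL contains a
# quote and appears in the text immediately after that same quote character,
# keep only the part before the first quote; otherwise return it unchanged.
def trim_quoted_url(url, html)
  return url unless url.include?("'") || url.include?('"')

  if html.include?("'#{url}")
    url.split("'").first
  elsif html.include?("\"#{url}")
    url.split('"').first
  else
    url
  end
end

# Hypothetical wrapper: extracts http/https URLs from a string and applies
# the trimming rule to each match, mirroring the structure of the run method.
def extract_urls(html)
  URI.extract(html, %w[http https]).map { |u| trim_quoted_url(u, html) }
end
```

For the input `var = 'http://blah.com?id=1'`, the comments above state that `URI.extract` matches `http://blah.com?id=1'`; passing that match and the original text to `trim_quoted_url` yields `http://blah.com?id=1`. A URL that merely contains a quote, without being preceded by one in the text, is left as-is.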