GitHub Repository: rapid7/metasploit-framework
Path: blob/master/lib/anemone/cli/url_list.rb
require 'anemone'
require 'optparse'
require 'ostruct'

# default settings; --relative switches the output to path-only URLs
options = OpenStruct.new
options.relative = false

begin
  # make sure that the last option is a URL we can crawl
  root = URI(ARGV.last)
rescue
  # no usable URL was given: print the usage text and stop
  puts <<-INFO
Usage:
  anemone url-list [options] <url>

Synopsis:
  Crawls a site starting at the given URL, and outputs the URL of each page
  in the domain as they are encountered.

Options:
  -r, --relative    Output relative URLs (rather than absolute)
INFO
  exit(0)
end

# parse command-line options
opts = OptionParser.new
opts.on('-r', '--relative') { options.relative = true }
opts.parse!(ARGV)

# page bodies are not needed here, only the URLs, so they are discarded
Anemone.crawl(root, :discard_page_bodies => true) do |anemone|

  # print each page as it is crawled: path only, or the full URL
  anemone.on_every_page do |page|
    if options.relative
      puts page.url.path
    else
      puts page.url
    end
  end

end
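
The argument handling above can be tried without crawling anything. The following is a minimal sketch using only the standard library; `parse_url_list_args` and `format_page_url` are hypothetical helper names that do not exist in Anemone. Unlike the script, the sketch parses the flags before checking the URL and rejects anything that is not an http(s) URL; both choices are assumptions made for illustration, not behavior of the original.

```ruby
require 'optparse'
require 'uri'

# Hypothetical helper: mirrors the script's argument handling without crawling.
# Returns [root_uri, relative_flag], or nil when no usable URL was given.
def parse_url_list_args(argv)
  args = argv.dup               # leave the caller's array untouched
  relative = false

  opts = OptionParser.new
  opts.on('-r', '--relative') { relative = true }
  opts.parse!(args)             # removes recognized flags from args

  return nil if args.empty?

  root = URI(args.last)
  # Assumption: only http(s) URLs are crawlable; the original script
  # performs no such check.
  return nil unless root.is_a?(URI::HTTP)

  [root, relative]
rescue URI::InvalidURIError, OptionParser::ParseError
  # malformed URL or unknown flag: treat both as "show usage"
  nil
end

# Hypothetical helper: picks the output form used inside on_every_page.
def format_page_url(uri, relative)
  relative ? uri.path : uri.to_s
end
```

A caller could print the usage text whenever `parse_url_list_args(ARGV)` returns `nil`, and otherwise pass the returned URI to the crawl and use `format_page_url(page.url, relative)` inside the page callback.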