GitHub Repository: rapid7/metasploit-framework
Path: blob/master/lib/anemone/cli/pagedepth.rb
require 'anemone'

begin
  # make sure that the first option is a URL we can crawl
  root = URI(ARGV[0])
rescue
  puts <<-INFO
Usage:
  anemone pagedepth <url>

Synopsis:
  Crawls a site starting at the given URL and outputs a count of
  the number of pages at each depth of the crawl.
INFO
  exit(0)
end

Anemone.crawl(root) do |anemone|
  anemone.skip_links_like %r{^/c/$}, %r{^/stores/$}

  anemone.after_crawl do |pages|
    pages = pages.shortest_paths!(root).uniq!

    depths = pages.values.inject({}) do |depths, page|
      depths[page.depth] ||= 0
      depths[page.depth] += 1
      depths
    end

    depths.sort.each { |depth, count| puts "Depth: #{depth} Count: #{count}" }
  end
end
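
The depth tally in the `after_crawl` block can be tried without running a crawl. The following is a minimal sketch, not part of the script: `FakePage` is a hypothetical stand-in for Anemone's page objects (only its `depth` attribute is used), and `count_by_depth` is a hypothetical helper name. It swaps the `inject({})` and `||= 0` pattern for `each_with_object(Hash.new(0))`, which should produce the same counts.

```ruby
# Hypothetical stand-in for a crawled page; only `depth` matters here.
FakePage = Struct.new(:url, :depth)

# Hypothetical helper: counts how many pages sit at each depth.
# Hash.new(0) supplies the zero default that the script sets up with `||= 0`.
def count_by_depth(pages)
  pages.each_with_object(Hash.new(0)) do |page, counts|
    counts[page.depth] += 1
  end
end

# Made-up sample data for illustration only.
pages = [
  FakePage.new('http://example.com/', 0),
  FakePage.new('http://example.com/a', 1),
  FakePage.new('http://example.com/b', 1),
  FakePage.new('http://example.com/a/c', 2)
]

# Same output format as the script: one line per depth, in ascending order.
count_by_depth(pages).sort.each do |depth, count|
  puts "Depth: #{depth} Count: #{count}"
end
```

With this sample data, the loop prints one line for each of depths 0, 1, and 2. Using a default-valued hash avoids both the explicit initialization and the need to return the accumulator from the block.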