GitHub Repository: rapid7/metasploit-framework
Path: blob/master/lib/anemone/cli/pagedepth.rb

require 'anemone'

begin
  # make sure that the first option is a URL we can crawl
  root = URI(ARGV[0])
rescue
  puts <<-INFO
Usage:
  anemone pagedepth <url>

Synopsis:
  Crawls a site starting at the given URL and outputs a count of
  the number of pages at each depth of the crawl.
INFO
  exit(0)
end

Anemone.crawl(root) do |anemone|
  anemone.skip_links_like %r{^/c/$}, %r{^/stores/$}

  anemone.after_crawl do |pages|
    pages = pages.shortest_paths!(root).uniq!

    depths = pages.values.inject({}) do |depths, page|
      depths[page.depth] ||= 0
      depths[page.depth] += 1
      depths
    end

    depths.sort.each { |depth, count| puts "Depth: #{depth} Count: #{count}" }
  end
end
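
The per-depth tally in the after_crawl block can be tried without running a crawl. The following is a minimal sketch in plain Ruby, not part of the original file: the `Page` struct, the `depth_counts` helper, and the example.com URLs are hypothetical stand-ins for Anemone's page objects, and only the `depth` attribute is modeled. It uses `each_with_object` with a zero-default hash in place of the `inject` and `||=` pair.

```ruby
# Hypothetical stand-in for a crawled page; only url and depth are modeled.
Page = Struct.new(:url, :depth)

# Count how many pages sit at each crawl depth.
# Hash.new(0) supplies 0 for unseen depths, so no explicit initialization is needed.
def depth_counts(pages)
  pages.each_with_object(Hash.new(0)) do |page, counts|
    counts[page.depth] += 1
  end
end

# Made-up sample data, assuming the root page is at depth 0.
pages = [
  Page.new('http://example.com/', 0),
  Page.new('http://example.com/a', 1),
  Page.new('http://example.com/b', 1),
  Page.new('http://example.com/a/c', 2)
]

# Sort by depth and print in the same format as the script.
depth_counts(pages).sort.each do |depth, count|
  puts "Depth: #{depth} Count: #{count}"
end
```

With the sample data, this prints one line per depth (0, 1, and 2) with counts 1, 2, and 1 respectively.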