The code powering m.abunchtell.com https://m.abunchtell.com

# frozen_string_literal: true

require 'concurrent'
require_relative '../../config/boot'
require_relative '../../config/environment'
require_relative 'cli_helper'

module Mastodon
  class DomainsCLI < Thor
    def self.exit_on_failure?
      true
    end

    option :dry_run, type: :boolean
    desc 'purge DOMAIN', 'Remove accounts from a DOMAIN without a trace'
    long_desc <<-LONG_DESC
      Remove all accounts from a given DOMAIN without leaving behind any
      records. Unlike a suspension, if the DOMAIN still exists in the wild,
      its accounts could return if they are resolved again.
    LONG_DESC
    def purge(domain)
      removed = 0
      dry_run = options[:dry_run] ? ' (DRY RUN)' : ''

      Account.where(domain: domain).find_each do |account|
        # In a dry run, only count the accounts that would be removed.
        SuspendAccountService.new.call(account, destroy: true) unless options[:dry_run]
        removed += 1
        say('.', :green, false)
      end

      # A dry run must not modify the database, so leave domain blocks alone too.
      DomainBlock.where(domain: domain).destroy_all unless options[:dry_run]

      say
      say("Removed #{removed} accounts#{dry_run}", :green)
    end

    option :concurrency, type: :numeric, default: 50, aliases: [:c]
    option :silent, type: :boolean, default: false, aliases: [:s]
    option :format, type: :string, default: 'summary', aliases: [:f]
    desc 'crawl [START]', 'Crawl all known peers, optionally beginning at START'
    long_desc <<-LONG_DESC
      Crawl the fediverse by using the Mastodon REST API endpoints that expose
      all known peers, and collect statistics from those peers, as long as those
      peers support those API endpoints. When no START is given, the command uses
      this server's own database of known peers to seed the crawl.

      The --concurrency (-c) option controls the number of threads performing HTTP
      requests at the same time. More threads means the crawl may complete faster.

      The --silent (-s) option controls progress output.

      The --format (-f) option controls how the data is displayed at the end. By
      default (`summary`), a summary of the statistics is returned. The other options
      are `domains`, which returns a newline-delimited list of all discovered peers,
      and `json`, which dumps all the aggregated data raw.
    LONG_DESC

    def crawl(start = nil)
      stats     = Concurrent::Hash.new
      processed = Concurrent::AtomicFixnum.new(0)
      failed    = Concurrent::AtomicFixnum.new(0)
      start_at  = Time.now.to_f
      seed      = start ? [start] : Account.remote.domains

      # max_queue: 0 leaves the task queue unbounded.
      pool = Concurrent::ThreadPoolExecutor.new(min_threads: 0, max_threads: options[:concurrency], idletime: 10, auto_terminate: true, max_queue: 0)

      work_unit = ->(domain) do
        # Skip domains that were already claimed, then claim this one
        # before any request is made.
        next if stats.key?(domain)

        stats[domain] = nil
        processed.increment

        begin
          Request.new(:get, "https://#{domain}/api/v1/instance").perform do |res|
            next unless res.code == 200
            stats[domain] = Oj.load(res.to_s)
          end

          # Queue every peer that has not been seen yet.
          Request.new(:get, "https://#{domain}/api/v1/instance/peers").perform do |res|
            next unless res.code == 200

            Oj.load(res.to_s).reject { |peer| stats.key?(peer) }.each do |peer|
              pool.post(peer, &work_unit)
            end
          end

          # If the instance request above did not return 200, stats[domain] is
          # still nil, so a successful activity response raises here and the
          # rescue below counts the domain as failed.
          Request.new(:get, "https://#{domain}/api/v1/instance/activity").perform do |res|
            next unless res.code == 200
            stats[domain]['activity'] = Oj.load(res.to_s)
          end

          say('.', :green, false) unless options[:silent]
        rescue StandardError
          failed.increment
          say('.', :red, false) unless options[:silent]
        end
      end

      seed.each do |domain|
        pool.post(domain, &work_unit)
      end

      # Give the workers time to start, then poll until the queue drains.
      sleep 20
      sleep 20 until pool.queue_length.zero?

      pool.shutdown
      pool.wait_for_termination(20)
    ensure
      # pool is nil if the executor could not be created.
      pool&.shutdown

      say unless options[:silent]

      case options[:format]
      when 'summary'
        stats_to_summary(stats, processed, failed, start_at)
      when 'domains'
        stats_to_domains(stats)
      when 'json'
        stats_to_json(stats)
      end
    end

    private

    def stats_to_summary(stats, processed, failed, start_at)
      # Drop domains that were visited but returned no instance data.
      stats.compact!

      # The weekly totals read index 1 of the activity list as "last week"
      # (index 0 would then be the current week).
      total_domains = stats.size
      total_users   = stats.reduce(0) { |sum, (_key, val)| val.is_a?(Hash) && val['stats'].is_a?(Hash) ? sum + val['stats']['user_count'].to_i : sum }
      total_active  = stats.reduce(0) { |sum, (_key, val)| val.is_a?(Hash) && val['activity'].is_a?(Array) && val['activity'].size > 2 && val['activity'][1].is_a?(Hash) ? sum + val['activity'][1]['logins'].to_i : sum }
      total_joined  = stats.reduce(0) { |sum, (_key, val)| val.is_a?(Hash) && val['activity'].is_a?(Array) && val['activity'].size > 2 && val['activity'][1].is_a?(Hash) ? sum + val['activity'][1]['registrations'].to_i : sum }

      say("Visited #{processed.value} domains, #{failed.value} failed (#{(Time.now.to_f - start_at).round}s elapsed)", :green)
      say("Total servers: #{total_domains}", :green)
      say("Total registered: #{total_users}", :green)
      say("Total active last week: #{total_active}", :green)
      say("Total joined last week: #{total_joined}", :green)
    end

    def stats_to_domains(stats)
      say(stats.keys.join("\n"))
    end

    def stats_to_json(stats)
      stats.compact!
      say(Oj.dump(stats))
    end
  end
end
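
The aggregation performed by `stats_to_summary` can be tried in isolation, without Rails, Thor, or a network. The following is only a minimal sketch under stated assumptions: `SAMPLE_STATS`, `last_week`, `summarize`, and the `*.example` domains are hypothetical names invented for illustration, and the sample values are made up. It assumes the crawl leaves `nil` for domains that returned no instance data and that activity counts may arrive as strings.

```ruby
# Hypothetical sample data shaped like the `stats` hash built by `crawl`.
SAMPLE_STATS = {
  'a.example' => {
    'stats' => { 'user_count' => 120 },
    'activity' => [
      { 'logins' => '3',  'registrations' => '1' }, # index 0: not used by the summary
      { 'logins' => '40', 'registrations' => '5' }, # index 1: read as "last week"
      { 'logins' => '38', 'registrations' => '2' },
    ],
  },
  'b.example' => { 'stats' => { 'user_count' => 30 } }, # no activity data
  'c.example' => nil, # visited, but no instance data was stored
}.freeze

# Returns the given field from index 1 of the activity list, or 0 when
# the data is missing or malformed (same guards as stats_to_summary).
def last_week(val, field)
  return 0 unless val.is_a?(Hash)

  activity = val['activity']
  return 0 unless activity.is_a?(Array) && activity.size > 2 && activity[1].is_a?(Hash)

  activity[1][field].to_i
end

# Computes the same four totals that stats_to_summary prints.
def summarize(stats)
  present = stats.compact # drop domains without instance data

  {
    servers: present.size,
    registered: present.sum { |_domain, val| val.is_a?(Hash) && val['stats'].is_a?(Hash) ? val['stats']['user_count'].to_i : 0 },
    active: present.sum { |_domain, val| last_week(val, 'logins') },
    joined: present.sum { |_domain, val| last_week(val, 'registrations') },
  }
end

p summarize(SAMPLE_STATS)
```

With this sample, only `a.example` contributes to the weekly totals, because `b.example` has no activity list and `c.example` is dropped by `compact`.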