Scumblr and Sketchy - Search, Screenshot Internet

Forum of the Community of Competitive Intelligence Practitioners (SPKR)

Competitive intelligence, business intelligence, corporate intelligence,
open-source intelligence for business.
We operate strictly within the law.

Vinni
Administrator

_ttp://techblog.netflix.com/2014/08/announcing-scumblr-and-sketchy-search.html

[q]

Netflix is pleased to announce the open source release of two security-related web applications: Scumblr and Sketchy!

Many security teams need to stay on the lookout for Internet-based discussions, posts, and other content that may affect the organizations they are protecting. These teams then take a variety of actions based on the nature of the findings. Netflix’s security team has these same requirements, and today we’re releasing some of the tools that help us in these efforts.
Scumblr is a Ruby on Rails web application that allows searching the Internet for sites and content of interest. Scumblr includes a set of built-in libraries that allow creating searches for common sites like Google, Facebook, and Twitter. For other sites, it is easy to create plugins to perform targeted searches and return results. Once you have Scumblr set up, you can run the searches manually or automatically on a recurring basis.
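
To make the plugin idea concrete, here is a minimal sketch of a search-provider registry. It is written in Python purely for illustration and is not Scumblr's actual Ruby plugin API; the names `Result`, `PROVIDERS`, `provider`, `search_example_forum`, and `run_search`, as well as the `forum.example` URLs, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Result:
    """One hit returned by a search provider."""
    url: str
    title: str


# Hypothetical registry: provider name -> function that runs a query.
PROVIDERS: Dict[str, Callable[[str], List[Result]]] = {}


def provider(name: str):
    """Decorator that registers a search function under a provider name."""
    def register(func):
        PROVIDERS[name] = func
        return func
    return register


@provider("example_forum")
def search_example_forum(query: str) -> List[Result]:
    # A real plugin would call the target site's search interface here.
    # This stub filters canned data so the sketch runs without network access.
    corpus = [
        Result("http://forum.example/t/1", "Leaked credentials for acme"),
        Result("http://forum.example/t/2", "Holiday photos"),
    ]
    return [r for r in corpus if query.lower() in r.title.lower()]


def run_search(query: str, provider_names: List[str]) -> List[Result]:
    """Run one query against the selected providers, dropping duplicate URLs."""
    seen, merged = set(), []
    for name in provider_names:
        for result in PROVIDERS[name](query):
            if result.url not in seen:
                seen.add(result.url)
                merged.append(result)
    return merged
```

A scheduler could call `run_search` on a timer to get the "recurring basis" behaviour described above; adding a new site then only means registering one more function.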

Scumblr leverages a gem called Workflowable (which we are also open sourcing) that allows setting up flexible workflows that can be associated with search results. These workflows can be customized so that different types of results go through different workflow processes depending on how you want to action them. Workflowable also has a plug-in architecture that allows triggering custom automated actions at each step of the process.
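
The workflow idea can be sketched as a small state machine with hooks that fire when a result enters a stage. This is an assumption-laden Python illustration, not the Workflowable gem's real interface; the stage names, the `TAKEDOWN` workflow, and the `TrackedResult` class are invented for the example.

```python
from typing import Callable, Dict, List, Optional

# A workflow maps each stage to the stages it is allowed to move to.
Workflow = Dict[str, List[str]]

# Hypothetical workflow for results that may need a takedown request.
TAKEDOWN: Workflow = {
    "new": ["under_review", "ignored"],
    "under_review": ["takedown_requested", "ignored"],
    "takedown_requested": ["closed"],
    "ignored": [],
    "closed": [],
}

# A hook receives the result URL and the stage that was just entered.
Hook = Callable[[str, str], None]


class TrackedResult:
    """A search result moving through one workflow."""

    def __init__(self, url: str, workflow: Workflow,
                 hooks: Optional[Dict[str, List[Hook]]] = None):
        self.url = url
        self.workflow = workflow
        self.hooks = hooks or {}
        self.stage = "new"

    def advance(self, stage: str) -> None:
        """Move to `stage` if the workflow allows it, then run its hooks."""
        if stage not in self.workflow[self.stage]:
            raise ValueError(f"cannot move from {self.stage!r} to {stage!r}")
        self.stage = stage
        for hook in self.hooks.get(stage, []):
            hook(self.url, stage)
```

Different result types could be given different `Workflow` dictionaries, and a hook could, for example, open a ticket or send a notification when a result reaches a given stage.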

Scumblr also integrates with Sketchy, which allows automatic screenshot generation of identified results to provide a snapshot-in-time of what a given page and result looked like when it was identified.

Scumblr makes use of the following components:
Ruby on Rails 4.0.9
Backend database for storing results
Redis + Sidekiq for background tasks
Workflowable for workflow creation and management
Sketchy for screenshot capture

We’re shipping Scumblr with built-in search libraries for seven common services, including Google, Twitter, and Facebook.

Getting Started with Scumblr and Workflowable
Scumblr and Workflowable are available now on the Netflix Open Source site. Detailed instructions on setup and configuration are available in the projects’ wiki pages.

Sketchy
One of the features we wanted to see in Scumblr was the ability to collect screenshots and text content from potentially malicious sites. This allows security analysts to preview Scumblr results without the risk of visiting the site directly. We wanted this collection system to be isolated from Scumblr and also resilient to sites that may perform malicious actions. We also decided it would be nice to build an API that we could use in other applications outside of Scumblr. Although a variety of tools and frameworks exist for taking screenshots, we discovered a number of edge cases that made taking reliable screenshots difficult: capturing screenshots from AJAX-heavy sites, cut-off images with virtual X drivers, and SSL and compression issues in the PhantomJS driver for Selenium, to name a few. In order to solve these challenges, we decided to leverage the best possible tools and create an API framework that would allow for reliable, scalable, and easy-to-use screenshot and text scraping capabilities. Sketchy to the rescue!

Architecture:
At a high level, Sketchy contains the following components:

Python + Flask to serve Sketchy
PhantomJS to take lazy captures of AJAX-heavy sites
Celery to manage jobs, and Redis to schedule and store job results
Backend database to store capture records (by leveraging SQLAlchemy)
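
How these pieces could fit together is sketched below: an API-side function records a capture request and enqueues it, and a worker-side function renders it and stores the outcome. This is only a rough in-process model under stated assumptions. The standard-library `queue.Queue` and a plain dictionary stand in for Celery with Redis and for the SQLAlchemy-backed database, and `submit_capture`, `work_once`, and the record fields are hypothetical names, not Sketchy's code.

```python
import queue
import uuid
from typing import Callable, Dict

# Stand-in for the Redis-backed Celery job queue (assumption for this sketch).
jobs: "queue.Queue[str]" = queue.Queue()
# Stand-in for the database table of capture records.
records: Dict[str, dict] = {}


def submit_capture(url: str) -> str:
    """API-side step: record the request, enqueue the job, return its id."""
    capture_id = uuid.uuid4().hex
    records[capture_id] = {"url": url, "status": "queued", "artifact": None}
    jobs.put(capture_id)
    return capture_id


def work_once(render: Callable[[str], bytes]) -> None:
    """Worker-side step: take one job, render it, store the outcome."""
    capture_id = jobs.get_nowait()
    record = records[capture_id]
    try:
        record["artifact"] = render(record["url"])
        record["status"] = "done"
    except Exception as exc:
        # A hostile or broken page should mark the job failed,
        # not bring down the worker.
        record["status"] = "failed"
        record["error"] = str(exc)
    finally:
        jobs.task_done()
```

Separating submission from rendering is what lets the slow, risky step run on isolated workers that can be scaled independently of the API.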

Sketchy Overview
Sketchy at its core provides a scalable task-based framework to capture screenshots, scrape page text, and save HTML through a simple-to-use API. These captures can be stored locally or in an AWS S3 bucket. Optionally, token auth can be configured and callbacks can be used if required. Sketchy uses PhantomJS with lazy rendering to ensure AJAX-heavy sites are captured correctly. Sketchy also uses the Celery task management system, allowing users to scale Sketchy accordingly and manage time-intensive captures for large sites.
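
One generic way to approach lazy rendering is to delay the capture until the page stops changing. The sketch below shows that idea in Python and is not taken from Sketchy's PhantomJS script; `wait_until_stable` and its parameters are hypothetical, and `snapshot` is assumed to be any callable that returns the page's current content.

```python
import time
from typing import Callable


def wait_until_stable(snapshot: Callable[[], str],
                      quiet_polls: int = 3,
                      interval: float = 0.5,
                      timeout: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once snapshot() is unchanged for `quiet_polls` polls in a row.

    Returns False if the page never settles within roughly `timeout` seconds.
    """
    max_polls = int(timeout / interval)
    last = snapshot()
    unchanged = 0
    for _ in range(max_polls):
        sleep(interval)
        current = snapshot()
        # Count consecutive identical snapshots; any change resets the count.
        unchanged = unchanged + 1 if current == last else 0
        last = current
        if unchanged >= quiet_polls:
            return True
    return False
```

A capture routine would take the screenshot only after this returns, and could fall back to capturing whatever is present when it returns False. The `sleep` parameter exists so the loop can be exercised without real delays.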

Getting Started with Sketchy
Sketchy is available now on the Netflix Open Source site and setup is straightforward. In addition, we've also created a Docker setup for Sketchy for interested users. Please visit the Sketchy wiki for documentation on how to get started.

Conclusion
Scumblr and Sketchy are helping the Netflix security team keep an eye on potential threats to our environment every day. We hope that the open source community can find new and interesting uses for the newest additions to the Netflix Open Source Software initiative. Scumblr, Sketchy, and the Workflowable gem are all available on our GitHub site now!
[/q]