CrawlbaseAPI 1.1.0
dotnet add package CrawlbaseAPI --version 1.1.0
NuGet\Install-Package CrawlbaseAPI -Version 1.1.0
<PackageReference Include="CrawlbaseAPI" Version="1.1.0" />
paket add CrawlbaseAPI --version 1.1.0
#r "nuget: CrawlbaseAPI, 1.1.0"
// Install CrawlbaseAPI as a Cake Addin
#addin nuget:?package=CrawlbaseAPI&version=1.1.0
// Install CrawlbaseAPI as a Cake Tool
#tool nuget:?package=CrawlbaseAPI&version=1.1.0
Crawlbase
.NET library for scraping and crawling websites using the Crawlbase API.
Installation
Install the CrawlbaseAPI package from NuGet.
Asynchronous Programming
Every method has a corresponding async version. For example, Get has an async version named GetAsync, Post has an async version named PostAsync, and so on.
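As a minimal sketch of the async variant, the snippet below awaits GetAsync and then reads the same properties the synchronous examples use. It assumes that GetAsync returns an awaitable Task and populates StatusCode and Body on the api object just as Get does; the README only confirms that the method exists, so check the library source for the exact signature. The AsyncExample class name is hypothetical.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncExample
{
    public static async Task Main()
    {
        var api = new Crawlbase.API("YOUR_TOKEN");
        try
        {
            // Assumption: GetAsync returns a Task and fills in the same
            // properties (StatusCode, Body) that the synchronous Get does.
            await api.GetAsync("https://www.apple.com");
            Console.WriteLine(api.StatusCode);
            Console.WriteLine(api.Body);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }
}
```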
Crawling API Usage
Initialize the API with one of your account tokens, either the normal token or the JavaScript token. Then make GET or POST requests accordingly.
Creating a Crawlbase account gives you a token for free along with 1000 free testing requests. You can use them for TCP calls, JavaScript calls, or both.
Crawlbase.API api = new Crawlbase.API("YOUR_TOKEN");
GET requests
Pass the url that you want to scrape plus any options from the ones available in the API documentation.
api.Get(url, options);
Example:
try {
    api.Get("https://www.facebook.com/britneyspears");
    Console.WriteLine(api.StatusCode);
    Console.WriteLine(api.OriginalStatus);
    Console.WriteLine(api.CrawlbaseStatus);
    Console.WriteLine(api.Body);
} catch(Exception ex) {
    Console.WriteLine(ex.ToString());
}
You can pass any option that the Crawlbase API supports as a dictionary of parameters.
Example:
api.Get("https://www.reddit.com/r/pics/comments/5bx4bx/thanks_obama/", new Dictionary<string, object>() {
{"user_agent", "Mozilla/5.0 (Windows NT 6.2; rv:20.0) Gecko/20121202 Firefox/30.0"},
{"format", "json"},
});
Console.WriteLine(api.StatusCode);
Console.WriteLine(api.Body);
Optionally set the store parameter to true to store a copy of the API response in the Crawlbase Cloud Storage.
Example:
api.Get("https://www.reddit.com/r/pics/comments/5bx4bx/thanks_obama/", new Dictionary<string, object>() {
{"store", "true"},
});
Console.WriteLine(api.StorageURL);
Console.WriteLine(api.StorageRID);
POST requests
Pass the URL that you want to scrape, the data that you want to send (either JSON or a string), plus any options from the ones available in the API documentation.
api.Post(url, data, options);
Example:
api.Post("https://producthunt.com/search", new Dictionary<string, object>() {
{"text", "example search"},
});
Console.WriteLine(api.StatusCode);
Console.WriteLine(api.Body);
You can send the data as application/json instead of x-www-form-urlencoded by setting the post_content_type option to json.
api.Post("https://httpbin.org/post", new Dictionary<string, object>() {
{"some_json", "with some value"},
}, new Dictionary<string, object>() {
{"post_content_type", "json"},
});
Console.WriteLine(api.StatusCode);
Console.WriteLine(api.Body);
Javascript requests
If you need to scrape a website built with a JavaScript framework such as React, Angular, or Vue, pass your JavaScript token and use the same calls. Note that only Get is available for JavaScript requests, not Post.
Crawlbase.API api = new Crawlbase.API("YOUR_JAVASCRIPT_TOKEN");
api.Get("https://www.nfl.com");
Console.WriteLine(api.StatusCode);
Console.WriteLine(api.Body);
In the same way, you can pass additional JavaScript options.
api.Get("https://www.freelancer.com", new Dictionary<string, object>() {
{"page_wait", "5000"},
});
Console.WriteLine(api.StatusCode);
Original status
You can always get the original status and the Crawlbase status from the response. Read the Crawlbase documentation to learn more about these statuses.
api.Get("https://sfbay.craigslist.org/");
Console.WriteLine(api.OriginalStatus);
Console.WriteLine(api.CrawlbaseStatus);
Scraper API usage
Initialize the Scraper API using your normal token and call the Get method.
Crawlbase.ScraperAPI scraper_api = new Crawlbase.ScraperAPI("YOUR_TOKEN");
Pass the url that you want to scrape plus any options from the ones available in the Scraper API documentation.
scraper_api.Get(url, options);
Example:
try {
scraper_api.Get("https://www.amazon.com/Halo-SleepSack-Swaddle-Triangle-Neutral/dp/B01LAG1TOS");
Console.WriteLine(scraper_api.StatusCode);
Console.WriteLine(scraper_api.Body);
} catch(Exception ex) {
Console.WriteLine(ex.ToString());
}
Leads API usage
Initialize with your Leads API token and call the Get method.
Crawlbase.LeadsAPI leads_api = new Crawlbase.LeadsAPI("YOUR_TOKEN");
try {
    leads_api.Get("stripe.com");
    Console.WriteLine(leads_api.StatusCode);
    Console.WriteLine(leads_api.Body);
    Console.WriteLine(leads_api.Success);
    Console.WriteLine(leads_api.RemainingRequests);
    foreach (var lead in leads_api.Leads)
    {
        Console.WriteLine(lead.Email);
        foreach (var source in lead.Sources)
        {
            Console.WriteLine(source);
        }
    }
} catch(Exception ex) {
    Console.WriteLine(ex.ToString());
}
If you have questions or need help using the library, please open an issue or contact us.
Screenshots API usage
Initialize with your Screenshots API token and call the Get method.
Crawlbase.ScreenshotsAPI screenshots_api = new Crawlbase.ScreenshotsAPI("YOUR_TOKEN");
try {
screenshots_api.Get("https://www.apple.com");
Console.WriteLine(screenshots_api.StatusCode);
Console.WriteLine(screenshots_api.ScreenshotPath);
} catch(Exception ex) {
Console.WriteLine(ex.ToString());
}
Or specify a file path to save the screenshot to:
Crawlbase.ScreenshotsAPI screenshots_api = new Crawlbase.ScreenshotsAPI("YOUR_TOKEN");
try {
screenshots_api.Get("https://www.apple.com", new Dictionary<string, object>() {
{"save_to_path", @"C:\Users\Default\Documents\apple.jpg"},
});
Console.WriteLine(screenshots_api.StatusCode);
Console.WriteLine(screenshots_api.ScreenshotPath);
} catch(Exception ex) {
Console.WriteLine(ex.ToString());
}
Note that the screenshots_api.Get(url, options) method accepts an options dictionary.
Also note that screenshots_api.Body is a Base64 string representation of the binary image file.
If you want to convert the body to bytes then you have to do the following:
byte[] bytes = Convert.FromBase64String(screenshots_api.Body);
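Building on the conversion above, a small helper can decode the Base64 body and write the image to disk. This is only a sketch: the ScreenshotFiles class, the SaveBase64Image method, and the path argument are hypothetical names introduced for illustration and are not part of the Crawlbase library. It uses only the standard Convert.FromBase64String and File.WriteAllBytes calls.

```csharp
using System;
using System.IO;

public static class ScreenshotFiles
{
    // Hypothetical helper: decodes a Base64 string (such as screenshots_api.Body)
    // and writes the resulting bytes to the given path.
    // Returns the number of bytes written.
    public static int SaveBase64Image(string base64Body, string path)
    {
        byte[] bytes = Convert.FromBase64String(base64Body);
        File.WriteAllBytes(path, bytes);
        return bytes.Length;
    }
}
```

For example, ScreenshotFiles.SaveBase64Image(screenshots_api.Body, "apple.jpg") would write the screenshot next to the running program, assuming the request succeeded and Body holds valid Base64 data.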
Storage API usage
Initialize the Storage API using your private token.
Crawlbase.StorageAPI storage_api = new Crawlbase.StorageAPI("YOUR_TOKEN");
Pass the url that you want to get from Crawlbase Storage.
try {
var response = storage_api.GetByUrl("https://www.apple.com");
Console.WriteLine(storage_api.StatusCode);
Console.WriteLine(storage_api.Body);
Console.WriteLine(response.OriginalStatus);
Console.WriteLine(response.CrawlbaseStatus);
Console.WriteLine(response.URL);
Console.WriteLine(response.RID);
Console.WriteLine(response.StoredAt);
} catch(Exception ex) {
Console.WriteLine(ex.ToString());
}
Or you can retrieve the item by its RID:
try {
var response = storage_api.GetByRID(RID);
Console.WriteLine(storage_api.StatusCode);
Console.WriteLine(storage_api.Body);
Console.WriteLine(response.OriginalStatus);
Console.WriteLine(response.CrawlbaseStatus);
Console.WriteLine(response.URL);
Console.WriteLine(response.RID);
Console.WriteLine(response.StoredAt);
} catch(Exception ex) {
Console.WriteLine(ex.ToString());
}
Delete request
To delete an item from your storage area, pass its RID:
try {
bool success = storage_api.Delete(RID);
Console.WriteLine(success);
} catch(Exception ex) {
Console.WriteLine(ex.ToString());
}
Bulk request
To make a bulk request, pass the RIDs as a list:
try {
    var list = new List<string>();
    list.Add(RID1);
    list.Add(RID2);
    list.Add(RIDn);
    var responses = storage_api.Bulk(list);
    Console.WriteLine(storage_api.StatusCode);
    foreach (var response in responses)
    {
        Console.WriteLine(response.OriginalStatus);
        Console.WriteLine(response.CrawlbaseStatus);
        Console.WriteLine(response.URL);
        Console.WriteLine(response.RID);
        Console.WriteLine(response.StoredAt);
        Console.WriteLine(response.Body);
    }
} catch(Exception ex) {
    Console.WriteLine(ex.ToString());
}
RIDs request
To request a bulk list of RIDs from your storage area
try {
var rids = storage_api.RIDs();
foreach (var rid in rids)
{
Console.WriteLine(rid);
}
} catch(Exception ex) {
Console.WriteLine(ex.ToString());
}
You can also specify a limit as a parameter
var rids = storage_api.RIDs(100);
Total Count
To get the total number of documents in your storage area
try {
var totalCount = storage_api.TotalCount();
Console.WriteLine(totalCount);
} catch(Exception ex) {
Console.WriteLine(ex.ToString());
}
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/crawlbase-source/crawlbase-net. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
License
The library is available as open source under the terms of the MIT License.
Code of Conduct
Everyone interacting in the Crawlbase project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
Copyright 2023 Crawlbase
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 is compatible. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
.NET Framework | net45 is compatible. net451 was computed. net452 was computed. net46 was computed. net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETCoreApp 2.0: Newtonsoft.Json (>= 13.0.1)
- .NETFramework 4.5: Newtonsoft.Json (>= 13.0.1)
- .NETStandard 2.0: Newtonsoft.Json (>= 13.0.1)
Supports
- Crawling API
- Scraper API
- Leads API
- Screenshots API
- Storage API