Do you classify web sites per URL?

The term “classification per URL” is overused in the industry. It gives the false impression that every URL is classified individually, and some companies boast that they classify billions of URLs. In reality, a limited number of domains are indeed checked per URL, and every other domain gets a single category for the entire domain.

The best example is Wikipedia. Our service classifies the entire domain as reference. You can test any Wikipedia article with competing services that claim per-URL classification and you will see it classified as reference (or similar), regardless of the article’s topic.
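The lookup order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Komodia's implementation: the two tables, the `video.example` entries, and the `classify` function are hypothetical, and the registered domain is approximated by the last two host labels.

```python
from urllib.parse import urlsplit

# Hypothetical data: a small set of domains that are checked per URL...
PER_URL_CATEGORIES = {
    "video.example": {
        "/watch/cooking-101": "food",
        "/watch/poker-night": "gambling",
    },
}
# ...and domains that carry one category for every page.
DOMAIN_CATEGORIES = {"wikipedia.org": "reference"}


def classify(url):
    """Return a per-URL category if one exists, else the domain category."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Naive assumption: the registered domain is the last two labels.
    domain = ".".join(host.split(".")[-2:])
    per_url = PER_URL_CATEGORIES.get(domain)
    if per_url is not None and parts.path in per_url:
        return per_url[parts.path]
    return DOMAIN_CATEGORIES.get(domain, "unknown")


wiki_poker = classify("https://en.wikipedia.org/wiki/Poker")
wiki_cooking = classify("https://en.wikipedia.org/wiki/Cooking")
video_poker = classify("https://video.example/watch/poker-night")
```

In this sketch both Wikipedia pages come back as reference regardless of topic, while the per-URL domain returns a page-specific category.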

Can you classify per URL?

We have an option to request that the server classify a URL on the fly, regardless of the domain category. This can be used when the client needs to go deeper on sites like YouTube, Wikipedia, etc.

We also have the Komodia PerPageSDK that allows you to classify each page on the fly at the client level.

Do you classify web searches?

Yes. When the server detects that a URL belongs to a search engine and contains a search query, it extracts the keyword and classifies it against a database of over 12 million keywords and phrases in 20 languages.
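As a rough sketch of that flow, the Python below pulls the search phrase out of a URL and looks it up in a keyword table. Everything here is an assumption made for illustration: the `SEARCH_PARAMS` table (search engines and the query parameter they use), the tiny `KEYWORD_CATEGORIES` table standing in for the real keyword database, and both function names.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed mapping of search-engine domains to their query parameter.
SEARCH_PARAMS = {"google.com": "q", "bing.com": "q", "duckduckgo.com": "q"}

# Hypothetical stand-in for the keyword and phrase database.
KEYWORD_CATEGORIES = {"poker": "gambling", "chocolate cake recipe": "food"}


def extract_search_phrase(url):
    """Return the search phrase if the URL is a search query, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Naive assumption: the registered domain is the last two labels.
    domain = ".".join(host.split(".")[-2:])
    param = SEARCH_PARAMS.get(domain)
    if param is None:
        return None
    # parse_qs decodes '+' and percent-escapes and drops empty values.
    values = parse_qs(parts.query).get(param)
    return values[0].strip().lower() if values else None


def classify_search(url):
    """Classify the search phrase in a URL; None if it is not a search."""
    phrase = extract_search_phrase(url)
    if phrase is None:
        return None
    return KEYWORD_CATEGORIES.get(phrase, "unknown")


cake = classify_search("https://www.google.com/search?q=chocolate+cake+recipe")
not_search = classify_search("https://example.com/?q=poker")
```

A URL that is not on a known search engine is left to the normal domain or URL classification, which is why the sketch returns None for it rather than a category.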

Do you classify keywords or phrases?

Yes, the server can accept a single keyword or a phrase for classification.

Do you classify web sites manually?

Some URLs are classified manually, but most classification is done by algorithms. Some companies claim they classify all web pages manually. We find that hard to believe, since some sites have over 90 million sub-sites. For example, run the query “site:blogger.com” on Google and you’ll see how many Blogger sub-sites and pages exist. That volume can’t be classified manually.

How do you handle new web sites?

When we get a request for a new web site, the server classifies it on the fly (so you don’t have to wait for someone to review it) and adds it to our sites database.
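The classify-then-store behavior can be pictured with a short Python sketch. It is a simplified illustration, not the actual server: `SiteDatabase`, `toy_classifier`, and the `example` domain are hypothetical names, and an in-memory dict stands in for the sites database.

```python
class SiteDatabase:
    """In-memory stand-in for a sites database with on-the-fly classification."""

    def __init__(self, classifier):
        self._classifier = classifier  # called only for unseen sites
        self._categories = {}

    def lookup(self, domain):
        domain = domain.lower()
        category = self._categories.get(domain)
        if category is None:
            # New site: classify it now and remember the result.
            category = self._classifier(domain)
            self._categories[domain] = category
        return category


calls = []


def toy_classifier(domain):
    """Hypothetical classifier; records each call so reuse is visible."""
    calls.append(domain)
    return "blog" if "blog" in domain else "unknown"


db = SiteDatabase(toy_classifier)
first = db.lookup("New-Blog.example")   # classified on the fly
second = db.lookup("new-blog.example")  # served from the database
```

The second lookup does not call the classifier again, which is the point of adding the new site to the database after the first request.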

Do you classify images?

There’s no reliable technology for image classification. There are attempts to use computer vision, but they have a 50% false positive rate. Images are given the same category as their domain, or, if you plan to filter image search, you can filter the search phrase itself.

It’s important to understand that images come with context (HTML, search phrase), and the context is what gives away the category.

How many URLs are in your database?

This question is relevant when a service is unable to classify web pages in real time. With on-the-fly classification, as Komodia does, the number has little meaning beyond a certain size.

That said, our database holds 10 million URLs.

What is your service coverage?

Our servers are strategically located around the globe to give good coverage to end users in different regions. If a client has most of its users in one country, we may add a server specifically for that country.

How often do you update your database?

We update our results every 7-10 days.

Who needs to manage the servers?

We manage Komodia’s servers, which are shared by all our clients. You can also request a dedicated server just for your clients; we can manage it for you, or you can manage it yourself.

Can I manage my own server?

Some companies prefer to manage their own server for security reasons. We can provide the server software for deployment at the client’s site.

How can you offer such a low price vs. the competition?

We spent one year designing the server with the prerequisite that it be cheap to maintain and not require the large teams of professionals that companies running old-generation classifiers need.