Dataclair

CELL2VEC: BTS embeddings for advanced network diagnostics

Overview

One of our main fields of focus is using cell-tower data to improve network performance and customer experience. We have shown that word2vec, a neural-network technique originally developed for modelling human language, can interpret raw cell-tower data and potentially help improve network performance.

Fig 1: Visualisation of the cell2vec output after dimensionality reduction

The problem of messy and unreliable data

With these techniques, we overcome the problem of messy, unreliable data produced when SIM cards connect to network base transceiver stations (BTS). The underlying issue is that base stations were never designed to provide meaningful location data. Their connections to individual devices can appear quite random, and many handovers between cells are not recorded. In the recorded data, even a known route, such as a train journey, appears to jump unpredictably between base stations, which makes it very difficult to pinpoint a device's location from this source alone.

We turned to word2vec

The difficulty of locating devices from base-station data runs contrary to the common belief that a mobile provider has very accurate location information about its customers. GPS data, meanwhile, is available only to phone operating-system providers and to apps with which customers have agreed to share it. After unsuccessfully trying the commonly used techniques for this problem, we turned to word2vec to find out whether it could reveal the locations of the base stations from raw network data, without any additional tagging or interpretation.

Just streams of plain text

We used absolutely no other information, just streams of plain text containing cell ID tokens. With the help of word2vec, we encoded these streams into latent representations of the base stations' state. These representations capture a great deal of information about the stations, including properties we did not anticipate. One of our most exciting moments, and an impulse for further work, was finding that word2vec combined with Uniform Manifold Approximation and Projection (UMAP) lets us obtain the longitude and latitude coordinates of each tower from nothing but the raw stream of network data. What else can be done with that? This is the subject of our ongoing internal research and development, in close cooperation with O2’s network experts.
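
To make the "streams of plain text" idea more concrete, here is a minimal Python sketch of how time-ordered cell IDs can be turned into the (center, context) pairs that a skip-gram word2vec model learns from. It is an illustration only, not the production pipeline: the cell IDs, the window size and the function name `skipgram_pairs` are all assumptions made for the example.

```python
from collections import Counter


def skipgram_pairs(sequence, window=2):
    """Yield (center, context) pairs for tokens at most `window` positions apart."""
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, sequence[j]


# Hypothetical per-SIM streams of cell IDs in time order (the IDs are invented).
# Each stream plays the role of a "sentence", each cell ID the role of a "word".
streams = [
    "A1 A2 A3 B1 B2".split(),
    "A2 A3 B1 B2 B3".split(),
]

# Count how often each cell appears in the context of another cell.
# Cells that are frequently visited one after the other share many pairs,
# which is the signal a skip-gram model compresses into embedding vectors.
pair_counts = Counter(
    pair for stream in streams for pair in skipgram_pairs(stream, window=1)
)

for (center, context), count in sorted(pair_counts.items()):
    print(center, context, count)
```

In a fuller experiment, the token lists would typically be passed to an existing word2vec implementation (for example the one in the gensim library), and the resulting vectors reduced to two dimensions with a UMAP implementation; those steps are left out here to keep the sketch free of third-party dependencies.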

Credits

Jan Romportl, Director

Katarína Vlčková, Data Science Team Lead