Beyond Data

Invivoo's staff travel across different fields of expertise in pursuit of new knowledge and experiences to share.

Creating your first Apache Airflow DAG

by Grow Together | 15 August 2023 | Big Data | 0 Comments

Over the past few years, Apache Airflow has established itself as the go-to data workflow management tool within any modern tech ecosystem. One of the main reasons Airflow became popular so quickly is its simplicity and how easy it is to get it up...

Entity Matching : From Traditional Techniques to Deep Learning Solutions

by Zaineb Aridhi | 6 June 2023 | Machine Learning | 0 Comments

In today's world, where data is abundant and diverse, it has become crucial to organize information in a systematic manner. One way to do this is through referencing, which involves indexing products, information, files, objects, buildings, etc., and mentioning them...

Anomaly detection and three most used algorithms

by Celia Zhang | 8 March 2023 | Machine Learning | 0 Comments

1. Description Anomaly detection, also known as outlier detection, is a group of problems whose purpose is to find the samples that behave differently from the majority. It is applied in many domains: fraud detection in finance and insurance, default detection in...

How Big Data can contribute in reducing Banks’ churn rate?

by Terry Malik | 23 February 2021 | Big Data | 0 Comments

According to a study conducted by Efma, one in two customers is willing to change banks in the next six months. The reason being? The lack of personalized products and services. At a time when competition between banks is raging, it is essential that they change their...

Explore and analyze your data with apache zeppelin – part 2

by Grow Together | 24 June 2020 | Big Data | 0 Comments

Welcome back to the second part of our series on Apache Zeppelin. In our previous post, ‘EXPLORE & ANALYSE YOUR DATA WITH APACHE ZEPPELIN - Part 1’, we introduced Apache Zeppelin as one of the best Big Data tools for your Data Analytics use cases and shared details about...

Apache Airflow: What is it and why you should start using it

by Grow Together | 11 February 2020 | Big Data | 0 Comments

In this data-driven era, the number of open-source Big Data technologies has risen exponentially in a matter of a few years. This multitude of options has introduced a vast range of patterns and architectures to store, process, and visualize...

Structured Streaming in Spark

by Grow Together | 6 December 2019 | Big Data | 0 Comments

Stream processing is a set of techniques used to extract information from unbounded data (a dataset that is theoretically infinite in size). Some examples of streaming use cases are device monitoring, fault detection, billing...

Discovering recommendation systems

by Sotifa R. Adidagacila | 27 July 2019 | Beyond Data, Big Data, Machine Learning | 0 Comments

WHAT ARE RECOMMENDATION SYSTEMS? We all wonder how Amazon or Netflix came to such "power" and success. How can Netflix know about our movie preferences? How did Amazon know that I, an unconditional Game of Thrones fan, love The North Face and geography?...

Recommendation engine : from collective to personalized

by Dorra Dhouib | 3 June 2019 | Big Data | 0 Comments

The recommendation engine is at the heart of the business strategy of all e-commerce giants. For example, 35 percent of Amazon's e-commerce revenue is generated by its recommendation engine, according to a McKinsey study. Every day we see the carousels of products that we...

Kafka: the Big Data streaming platform

by Morgan Grignard | 29 May 2019 | Big Data | 0 Comments

In modern information systems, we are confronted with ever-increasing volumes of data that must be processed in real time. However, the point-to-point connections commonly used do not scale easily as the load grows. Data-producing services have a strong link with...

Paris Big Data conference: Couchbase that other NoSQL database, deserves your attention

by Grow Together | 29 May 2019 | Big Data | 0 Comments

During this year’s edition of the Paris Big Data conference, amid an infinite set of booths filled with flashy promises of performance and scalability, one company stood out from the rest. Couchbase, the document-oriented NoSQL database, came to the conference armed...

Why is Spark Fast? And how to make it run faster? Part I: the Spark ABC

by Grow Together | 6 March 2019 | Big Data | 0 Comments

This is the first article in a series that discusses the mechanisms behind Apache Spark and how this data-processing framework disrupted the Big Data ecosystem, while giving you key recommendations to fine-tune your Spark jobs. Spark does things fast. That has always...

Why is Spark Fast? And how to make it run faster? Part III. Getting Spark to the next level

by Grow Together | 6 March 2019 | Big Data | 0 Comments

This is the third and last article of the Spark-centered series. Reading the first and second parts is highly recommended before going through this one, in which we’ll discuss how you could optimize Spark jobs from your end of the spectrum. Throughout the past article...

Why is Spark Fast? And how to make it run faster ? Part II: The Spark Magic

by Grow Together | 27 February 2019 | Big Data | 0 Comments

This is the second article in the "Why is Spark Fast? And how to make it run faster" series. The series discusses the mechanisms behind Apache Spark and how this data-processing framework disrupted the Big Data ecosystem. Reading the first part beforehand is...

Notebooks are The Missing Piece of the Big Data Revolution

by Grow Together | 20 December 2018 | Big Data | 0 Comments

More than a decade ago, what is now commonly known as the Big Data era started with the emergence of Hadoop. Since then, a multitude of technologies were introduced to fulfill multiple tasks within the Hadoop ecosystem, with capabilities ranging from processing data...

Monitoring and detection of anomalies with ELK

by Sotifa R. Adidagacila | 24 October 2018 | DevOps, Machine Learning | 0 Comments

MEASURING PERFORMANCE INDICATORS WITH ELK Monitoring and measuring IT applications’ performance indicators are a major challenge for companies. The evolution of technologies around the qualification, storage and processing of big data, as well as machine learning, has made it...

5 pre-requisites before launching a Big Data project

by Sotifa R. Adidagacila | 18 September 2018 | Big Data | 0 Comments

A BIG DATA PROJECT IS FULL OF PITFALLS Many pitfalls, and not only on the IT side! During my visit to the Big Data Salon, held in Paris on March 6 and 7, 2017, I was able to attend 14 feedback sessions, both informative and operational. Here is a summary of this visit...
