2 Critical GCP Message Queue Deployment Processes Explained

Google Cloud Platform, Message Queue • May 12th, 2022

Google Cloud is a suite of Cloud computing services that works on the same infrastructure that Google leverages internally for its end-use products such as Gmail, Google Search, and Google Drive, to name a few. Its strength lies in Artificial Intelligence, Big Data processing tools, and Machine Learning initiatives along with container support.

This blog talks about the two methods you can leverage to deploy the GCP Message Queue for your Data Pipeline, namely, Cloud Tasks and Pub/Sub. It also gives a brief introduction to Google Cloud Platform before diving into the nitty-gritty of Google Message Queue.

What is GCP?

GCP Message Queue: GCP Logo

Google Cloud Platform provides the same virtual machine functionality and core data storage as Azure and AWS, or any other cloud service provider.

Google’s Dataflow and BigQuery bring strong processing and analytics capabilities to the table for data-heavy companies, while Google’s Kubernetes container technology allows for easier container deployment and container cluster management. With a slew of Machine Learning APIs and Google’s Cloud Machine Learning Engine at your disposal, you can easily leverage Artificial Intelligence in the Cloud.

Security and Data privacy features are also quite mature in the Google Cloud Platform. With Access Transparency, you can create near-real-time logs of when GCP support system engineers or support representatives interact with your data.

Key Features of GCP

GCP Message Queue: GCP Architecture

Here are a few key features of GCP that allow it to stand out in the market:

  • Machine Learning and Artificial Intelligence: Deep Learning VM images provide preconfigured VMs for Deep Learning applications. With Vertex Data Labeling, you can easily manage the annotation of high-quality model training data. GCP offers TensorFlow Enterprise, which ensures performance and reliability for Artificial Intelligence apps with managed services and enterprise-grade support. Dialogflow serves as a development suite for conversational applications and virtual agents.
  • Hybrid and Multicloud: Google Cloud provides a service for executing builds on Google Cloud infrastructure known as Cloud Build. Similarly, Anthos is a platform that can be used to build new apps and modernize existing ones.
  • Compute: With Cloud GPUs for Machine Learning, 3D visualization, and scientific computing you can set up a seamless framework for your Data Pipeline. With Shielded VMs, you can reinforce the virtual machines on the cloud pretty easily as well.

Accelerate GCP ETL using Hevo’s No-code Data Pipelines

Hevo Data, a No-code Data Pipeline helps to transfer data from 100+ sources like Google Analytics 360, and Google Cloud Storage, to a Data Warehouse or a destination of your choice like Google BigQuery and visualize analysis-ready data in your desired BI tool. 

Hevo is fully managed and completely automates the process of not only loading data from your desired source but also enriching the data and transforming it into an analysis-ready form without even having to write a single line of code. Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss.

Get Started with Hevo for Free

Hevo is the fastest, easiest, and most reliable data replication platform that will save your engineering bandwidth and time multifold. Try our 14-day full access free trial today to experience an entirely automated hassle-free Data Replication!

What is Google Cloud Tasks?

GCP Message Queue: Google Cloud Tasks

With Google Cloud Tasks, you can separate out pieces of work that can be executed independently, outside of your main application flow, and send them off to be processed asynchronously by handlers that you create. These independent pieces of work are known as Tasks. For instance, you may need to update a database as part of processing a user request, but the update might be slow. Offloading that detail as a task lets you return from the request more quickly.

The offloaded task then gets added to a queue, which persists the task until it is successfully executed. Based on your initial configuration, the queue can also serve as dispatch flow control. You create and configure the queue, which is then managed by the Cloud Tasks service. Once tasks are added, the queue dispatches them and makes sure they are reliably processed by your workers. Complexities associated with the process, such as server crashes, user-facing latency costs, retry management, and resource consumption limits, are handled by the service.

Each task consists of a unique name and configuration information. Optionally, it also includes data from the initial request, called the payload, that is required to process the request. Since the payload is sent in the request body, tasks that include a payload need to use PUT or POST as their HTTP method.
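
The payload rule above can be sketched as a small validation. This is an illustrative in-memory model only: the TaskSpec class and its fields are hypothetical names made up for this example and are not part of the Cloud Tasks client library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Objects;

/** Hypothetical in-memory model of a task; NOT the Cloud Tasks client API. */
final class TaskSpec {
    final String name;        // unique task name
    final String httpMethod;  // HTTP method used to dispatch the task to its handler
    final byte[] payload;     // optional data from the initial request (may be empty)

    TaskSpec(String name, String httpMethod, byte[] payload) {
        this.name = Objects.requireNonNull(name, "name");
        this.httpMethod =
            Objects.requireNonNull(httpMethod, "httpMethod").toUpperCase(Locale.ROOT);
        this.payload = payload == null ? new byte[0] : payload.clone();
        // The payload travels in the request body, so only POST or PUT can carry it.
        if (this.payload.length > 0
                && !this.httpMethod.equals("POST")
                && !this.httpMethod.equals("PUT")) {
            throw new IllegalArgumentException(
                "Tasks with a payload must use POST or PUT, got " + this.httpMethod);
        }
    }

    boolean hasPayload() {
        return payload.length > 0;
    }

    /** Convenience factory that encodes a text body as UTF-8 bytes. */
    static TaskSpec withPayload(String name, String httpMethod, String body) {
        return new TaskSpec(name, httpMethod, body.getBytes(StandardCharsets.UTF_8));
    }
}
```

A task without a payload can use any method in this sketch, while something like TaskSpec.withPayload("bad", "GET", "x") is rejected with an IllegalArgumentException.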

Here are a few key use cases of Google Cloud Tasks:

  • Preserving requests during unexpected production incidents.
  • Speeding up user response times by delegating potentially slow background operations, such as database updates, to a worker.
  • Managing third-party API call rates.
  • Smoothing traffic spikes by removing non-user-facing tasks from the main user flow.

What is Google Pub/Sub?

GCP Message Queue: Google Pub/Sub

Like Google Cloud Tasks, Pub/Sub is a reliable, highly available messaging service that can be used to send messages between applications. Each subscription can retain unacknowledged messages for up to 7 days, and the service is entirely managed by Google Cloud. While there are various other tools proficient in message passing (RabbitMQ, Kafka, Amazon SQS/SNS), Pub/Sub keeps a reasonable degree of feature parity with Cloud Tasks, which makes migrating between the two relatively quick.

Cloud Pub/Sub moves messages between publishers and subscribers connected by a shared topic. Messages are queued until a consumer is ready to receive them. Pub/Sub separates the mechanism for enqueuing messages (the publisher) from the mechanism for de-queuing them (the subscription): an application uses a subscription to pull the messages that were published to the topic. Architecturally, this further decouples producers from consumers.
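
To make the topic and subscription split concrete, here is a minimal in-memory sketch. InMemoryTopic and its methods are hypothetical names for illustration; the real Pub/Sub service is distributed and managed, and this toy ignores acknowledgment, retention, and ordering guarantees.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical in-memory topic; real Pub/Sub is a managed, distributed service. */
final class InMemoryTopic {
    // One independent backlog per subscription name.
    private final Map<String, Queue<String>> subscriptions = new LinkedHashMap<>();

    /** Registers a subscription; it only sees messages published afterwards. */
    void subscribe(String subscriptionName) {
        subscriptions.putIfAbsent(subscriptionName, new ArrayDeque<>());
    }

    /** Publisher side: enqueue the message for every subscription on the topic. */
    void publish(String message) {
        for (Queue<String> backlog : subscriptions.values()) {
            backlog.add(message);
        }
    }

    /** Subscriber side: take the next message, or null if the backlog is empty. */
    String pull(String subscriptionName) {
        Queue<String> backlog = subscriptions.get(subscriptionName);
        if (backlog == null) {
            throw new IllegalArgumentException("Unknown subscription: " + subscriptionName);
        }
        return backlog.poll();
    }
}
```

The publisher only knows the topic, and each subscription drains its own backlog, so adding a consumer never requires changing the producer.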

Compared to Cloud Tasks, Cloud Pub/Sub doesn’t put a ceiling on the message delivery rate and is globally available. However, everything isn’t hunky-dory with Cloud Pub/Sub either: it doesn’t provide tracking of delivery attempts, scheduled delivery of tasks, or deduplication by task or message name.

Cloud Tasks vs Pub/Sub: Key Differences

The primary difference between Cloud Tasks and Pub/Sub is the notion of explicit vs implicit invocation.

The main focus of Pub/Sub is to decouple publishers of events from subscribers to those events. Publishers can be oblivious to their subscribers, which means publishers have no control over the delivery of messages beyond the guarantee of delivery. In this way, Pub/Sub supports implicit invocation: a publisher implicitly causes the subscribers to execute by publishing an event.

On the other hand, the primary focus of Cloud Tasks is explicit invocation. Here, the publisher retains full control of the execution. Specifically, a publisher specifies an endpoint where each message needs to be delivered. On top of this, Cloud Tasks provides tools for task and queue management that aren’t available to Pub/Sub publishers, including:

  • Delivery Rate Controls
  • Scheduling Specific Delivery Times
  • Message/Task Creation Deduplication
  • Configurable Retries
  • Management and Access of Individual Tasks in a GCP Message Queue
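
Two of the controls listed above, creation deduplication by task name and scheduled delivery, can be sketched with a small in-memory queue. NamedTaskQueue is a hypothetical class written for this illustration; it does not reflect how Cloud Tasks implements these features internally.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

/** Hypothetical queue showing dedup-by-name and scheduled delivery; not the GCP API. */
final class NamedTaskQueue {
    private static final class Entry {
        final String name;
        final long runAtMillis;

        Entry(String name, long runAtMillis) {
            this.name = name;
            this.runAtMillis = runAtMillis;
        }
    }

    private final Set<String> seenNames = new HashSet<>();
    // Earliest scheduled time first.
    private final PriorityQueue<Entry> pending =
        new PriorityQueue<>((a, b) -> Long.compare(a.runAtMillis, b.runAtMillis));

    /** Returns false (and enqueues nothing) if a task with this name was already created. */
    boolean enqueue(String name, long runAtMillis) {
        if (!seenNames.add(name)) {
            return false;
        }
        pending.add(new Entry(name, runAtMillis));
        return true;
    }

    /** Removes and returns the names of all tasks whose scheduled time has passed. */
    List<String> pollDue(long nowMillis) {
        List<String> due = new ArrayList<>();
        while (!pending.isEmpty() && pending.peek().runAtMillis <= nowMillis) {
            due.add(pending.poll().name);
        }
        return due;
    }
}
```

Because the caller names each task and chooses its delivery time, the publisher stays in control of execution, which is the explicit-invocation model described above.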

In terms of geographic availability, Cloud Tasks is regional while Google Pub/Sub is global. Cloud Tasks retains tasks for up to 30 days, while Cloud Pub/Sub retains messages for up to 7 days.

Apart from this, Cloud Pub/Sub messaging costs scale with the size of the message. The ceiling on message size for Cloud Pub/Sub is 10 MB, as opposed to only 100 KB for Cloud Tasks. With Pub/Sub, your services can therefore pass larger message content in-band instead of relying on an external store (such as Cloud Storage or Cloud Bigtable) whenever the content exceeds the Cloud Tasks size limit.
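
As a rough illustration of these size limits, the sketch below picks the smallest transport that can carry a payload in-band. The class, enum, and method names are hypothetical, the limits are the figures quoted in this article (treating 1 KB as 1024 bytes is an assumption), and current quotas should be checked against the official documentation.

```java
/** Hypothetical helper; limits are the figures quoted in this article, not live quotas. */
final class PayloadRouting {
    static final long CLOUD_TASKS_MAX_BYTES = 100L * 1024;     // 100 KB (assuming 1 KB = 1024 B)
    static final long PUBSUB_MAX_BYTES = 10L * 1024 * 1024;    // 10 MB

    enum Transport { CLOUD_TASKS_IN_BAND, PUBSUB_IN_BAND, EXTERNAL_STORE_REFERENCE }

    /** Returns the smallest option that can carry the payload inside the message itself. */
    static Transport smallestFit(long payloadBytes) {
        if (payloadBytes < 0) {
            throw new IllegalArgumentException("payloadBytes must not be negative");
        }
        if (payloadBytes <= CLOUD_TASKS_MAX_BYTES) {
            return Transport.CLOUD_TASKS_IN_BAND;
        }
        if (payloadBytes <= PUBSUB_MAX_BYTES) {
            return Transport.PUBSUB_IN_BAND;
        }
        // Too large for either service: store the content elsewhere and send a reference.
        return Transport.EXTERNAL_STORE_REFERENCE;
    }
}
```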

Cloud Pub/Sub’s pricing scales with the volume of data that gets transmitted ($40/TiB after the first 10 GiB), in contrast to Cloud Tasks, which is billed by the number of operations.
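
The arithmetic behind that rate can be sketched as follows. The numbers are the ones quoted above and may be out of date; the class and method names are hypothetical, and treating the 10 GiB allowance as applying once per billing period is an assumption.

```java
/** Hypothetical estimator using the rate quoted in this article; verify current pricing. */
final class PubSubCostSketch {
    static final long GIB = 1024L * 1024 * 1024;
    static final long TIB = 1024L * GIB;
    static final long FREE_BYTES = 10L * GIB;   // free allowance (assumed per billing period)
    static final double USD_PER_TIB = 40.0;     // rate quoted in the article

    /** Estimated charge in USD for the given number of transmitted bytes. */
    static double estimateUsd(long bytesTransmitted) {
        if (bytesTransmitted < 0) {
            throw new IllegalArgumentException("bytesTransmitted must not be negative");
        }
        long billable = Math.max(0L, bytesTransmitted - FREE_BYTES);
        return (double) billable / TIB * USD_PER_TIB;
    }
}
```

Under these assumptions, traffic within the first 10 GiB costs nothing, and one full TiB beyond the allowance comes to $40.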

Understanding the GCP Message Queue Deployment Process

Here are the methods you can use to deploy the GCP Message Queue:

GCP Message Queue Deployment: Using Google Cloud Tasks

You can create a general workflow for GCP Message Queue using Google Cloud Tasks as follows:

  • First, you need to create a worker to process the tasks.
  • Next, you need to create a GCP Message Queue.
  • After you’ve created the queue, you can create tasks programmatically and add them to it.
  • The Cloud Tasks service returns an OK to the originating application. This indicates that the task has been successfully written to Cloud Tasks storage, which makes the create-task request both highly available and durable.
  • Next, the tasks get passed to the worker, which then processes them.
  • Finally, to complete the sequence, the worker returns a 2xx success status code to the Cloud Tasks service.

Once the task has been handed off to the GCP Message Queue, no data is made available to the initial request.
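
The last two steps of this workflow can be sketched as a dispatch loop that treats any 2xx status from the worker as success and retries otherwise. DispatchSketch and its Worker interface are hypothetical names for illustration; the real service also applies backoff between attempts and rate limits, which this sketch leaves out.

```java
/** Hypothetical dispatch loop; the managed service adds backoff and rate limiting. */
final class DispatchSketch {
    /** Worker contract: receives the task payload and returns an HTTP status code. */
    interface Worker {
        int handle(String payload);
    }

    /** A task counts as done only when the worker answers with a 2xx status. */
    static boolean isSuccess(int status) {
        return status >= 200 && status < 300;
    }

    /** Returns the number of attempts used, or -1 if every attempt failed. */
    static int dispatch(String payload, Worker worker, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isSuccess(worker.handle(payload))) {
                return attempt;
            }
        }
        return -1;
    }
}
```

A worker that answers 503 twice and then 200 is reported as succeeding on the third attempt, while a worker that never returns a 2xx status exhausts its attempts.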

What Makes Hevo’s Data Pipeline Best in Class?

Aggregating & loading your data from various applications can be a mammoth task without the right set of tools. Hevo’s automated platform empowers you with everything you need to have for a smooth Data Replication experience. Our platform has the following in store for you!

  • Exceptional Security: A Fault-tolerant Architecture that ensures Zero Data Loss.
  • Built to Scale: Exceptional Horizontal Scalability with Minimal Latency for Modern-data Needs.
  • Data Transformations: Process and Enrich Raw Granular Data using Hevo’s robust & built-in Transformation Layer without writing a single line of code.
  • Built-in Connectors: Support for 100+ Custom Data Sources, including Databases, SaaS Platforms, Native Webhooks, REST APIs, Files & More.
  • Auto Schema Mapping: Hevo takes away the tedious task of schema management & automatically detects the format of incoming data and replicates it to the destination schema. You can also choose between Full & Incremental Mappings to suit your Data Replication requirements.
  • Live Support: The Hevo team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
Sign up here for a 14-Day Free Trial!

GCP Message Queue Deployment: Using Google Pub/Sub

Here are the steps you can follow to deploy the GCP Message Queue by leveraging Google Pub/Sub:

Step 1: Getting Started

  • To deploy GCP Message Queue using Google Pub/Sub, first, you need to download and unzip the source repository for this process, or you can simply clone it by leveraging Git, with the following command:
git clone https://github.com/spring-guides/gs-messaging-gcp-pubsub.git
  • Next, you need to cd into gs-messaging-gcp-pubsub/initial.

Step 2: Adding Required Dependencies

  • To add required dependencies to GCP Message Queue through Maven, you need to add the following snippet to your pom.xml file:
<dependencies>
    ...
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-gcp-starter-pubsub</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.integration</groupId>
        <artifactId>spring-integration-core</artifactId>
    </dependency>
    ...
</dependencies>
  • If you’re using Maven, it is highly recommended to leverage the Spring Cloud GCP bill of materials to handle the various versions of your GCP Message Queue dependencies:
<properties>
    ...
    <spring-cloud-gcp.version>1.2.5.RELEASE</spring-cloud-gcp.version>
    ...
</properties>

<dependencyManagement>
    <dependencies>
       ...
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-gcp-dependencies</artifactId>
            <version>${spring-cloud-gcp.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
        ...
    </dependencies>
</dependencyManagement>

Step 3: Setting up Google Cloud Pub/Sub Environment

  • To set up the Google Cloud Pub/Sub environment for GCP Message Queue, you need a topic and a subscription to send and receive messages through Google Cloud Pub/Sub. You can create them in the Google Cloud Console, or programmatically with the help of the PubSubAdmin class.
  • For this example, you need to create a topic named ‘testTopic’ and a subscription for that topic named ‘testSubscription’.

Step 4: Creating Application Files

  • To create application files for GCP Message Queue, you’ll need a class that holds the channel adapters along with the messaging configuration. Create a PubSubApplication class with the @SpringBootApplication annotation, as is typical for a Spring Boot application:
src/main/java/hello/PubSubApplication.java
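package hello;

import java.io.IOException;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class PubSubApplication {

  public static void main(String[] args) throws IOException {
    SpringApplication.run(PubSubApplication.class, args);
  }

}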
  • On top of this, since you are creating a web application, you can create a WebAppController class to separate controller logic from configuration logic.
src/main/java/hello/WebAppController.java
@RestController
public class WebAppController {
}
  • Next, you need to add the two missing files for HTML and properties:
src/main/resources/static/index.html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Spring Integration GCP sample</title>
</head>
<body>
<div name="formDiv">
  <form action="/publishMessage" method="post">
    Publish message: <input type="text" name="message" /> <input type="submit" value="Publish!"/>
  </form>
</div>
</body>
</html>
src/main/resources/application.properties
#spring.cloud.gcp.project-id=[YOUR_GCP_PROJECT_ID_HERE]
#spring.cloud.gcp.credentials.location=file:[LOCAL_FS_CREDENTIALS_PATH]
  • The Spring Cloud GCP Core Boot starter can auto-configure these two properties, which makes them optional (this is why they are commented out above). Properties from the properties file always take precedence over the Spring Boot auto-configuration. The Spring Cloud GCP Core Boot starter is bundled with the Spring Cloud GCP Pub/Sub Boot starter.

Step 5: Creating an Inbound Channel Adapter

  • An inbound channel adapter listens to messages from a Google Cloud Pub/Sub subscription and forwards them to a Spring channel within an application.
  • Instantiating an inbound channel adapter requires a PubSubTemplate instance along with the name of an existing subscription. PubSubTemplate is Spring’s abstraction to subscribe to Google Cloud Pub/Sub topics. The Spring Cloud GCP Pub/Sub Boot starter offers an auto-configured PubSubTemplate instance, which can be injected as a method argument.
src/main/java/hello/PubSubApplication.java
  @Bean
  public PubSubInboundChannelAdapter messageChannelAdapter(
      @Qualifier("pubsubInputChannel") MessageChannel inputChannel,
      PubSubTemplate pubSubTemplate) {
    PubSubInboundChannelAdapter adapter =
        new PubSubInboundChannelAdapter(pubSubTemplate, "testSubscription");
    adapter.setOutputChannel(inputChannel);
    // Acknowledge messages manually instead of relying on the automatic default.
    adapter.setAckMode(AckMode.MANUAL);

    return adapter;
  }
  • By default, the adapter’s message acknowledgment mode is automatic. You can override this behavior, as the adapter above does with setAckMode(AckMode.MANUAL). After the channel adapter is instantiated, you also need to configure the output channel to which the adapter sends the received messages.
src/main/java/hello/PubSubApplication.java
  @Bean
  public MessageChannel pubsubInputChannel() {
    return new DirectChannel();
  }
  • Next, attach a service activator to the inbound channel to process incoming messages.
src/main/java/hello/PubSubApplication.java
  @Bean
  @ServiceActivator(inputChannel = "pubsubInputChannel")
  public MessageHandler messageReceiver() {
    return message -> {
      LOGGER.info("Message arrived! Payload: " + new String((byte[]) message.getPayload()));
      BasicAcknowledgeablePubsubMessage originalMessage =
          message.getHeaders().get(GcpPubSubHeaders.ORIGINAL_MESSAGE, BasicAcknowledgeablePubsubMessage.class);
      originalMessage.ack();
    };
  }
  • The ServiceActivator input channel name needs to match the name of the input channel bean, which is the name of the method that creates it (pubsubInputChannel). Whenever a new message shows up in that channel, it gets processed by the returned MessageHandler. In this instance, the message is processed simply by logging its body and acknowledging it. With manual acknowledgment, a message is acknowledged through the BasicAcknowledgeablePubsubMessage object, which is attached to the Message headers and can be extracted with the GcpPubSubHeaders.ORIGINAL_MESSAGE key.
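
To see why manual acknowledgment matters, here is a plain-Java sketch of the idea: a message stays in the backlog until the handler explicitly acknowledges it. AckSketch, Delivery, and Handler are hypothetical names for this illustration and are unrelated to the Spring Cloud GCP classes used above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory model of manual acknowledgment; not the Spring Cloud GCP API. */
final class AckSketch {
    /** A delivered message that the handler must acknowledge explicitly. */
    static final class Delivery {
        final String payload;
        private boolean acked;

        Delivery(String payload) {
            this.payload = payload;
        }

        void ack() {
            acked = true;
        }

        boolean isAcked() {
            return acked;
        }
    }

    interface Handler {
        void onMessage(Delivery delivery);
    }

    private final Deque<String> backlog = new ArrayDeque<>();

    void publish(String payload) {
        backlog.addLast(payload);
    }

    /** Delivers the oldest message; it is removed only if the handler acknowledged it. */
    boolean deliverNext(Handler handler) {
        String payload = backlog.peekFirst();
        if (payload == null) {
            return false;
        }
        Delivery delivery = new Delivery(payload);
        handler.onMessage(delivery);
        if (delivery.isAcked()) {
            backlog.removeFirst();
        }
        return delivery.isAcked();
    }

    int pending() {
        return backlog.size();
    }
}
```

A handler that returns without calling ack() leaves the message pending, so it is delivered again on the next attempt. This is the behavior that makes forgetting originalMessage.ack() in manual mode lead to redelivery.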

Step 6: Creating an Outbound Channel Adapter

  • An outbound channel adapter will listen to new messages from a Spring Channel and publish them to a Google Cloud Pub/Sub topic.
  • To instantiate an outbound channel adapter, you need a PubSubTemplate along with the name of an existing topic. PubSubTemplate is Spring’s abstraction to publish messages to Google Cloud Pub/Sub topics. The Spring Cloud GCP Pub/Sub Boot starter offers an auto-configured PubSubTemplate instance.
src/main/java/hello/PubSubApplication.java
  @Bean
  @ServiceActivator(inputChannel = "pubsubOutputChannel")
  public MessageHandler messageSender(PubSubTemplate pubsubTemplate) {
  return new PubSubMessageHandler(pubsubTemplate, "testTopic");
  }
  • You can leverage a MessagingGateway to write messages to a channel and publish them to Google Cloud Pub/Sub.
  @MessagingGateway(defaultRequestChannel = "pubsubOutputChannel")
  public interface PubsubOutboundGateway {

    void sendToPubsub(String text);
  }
  • From this code snippet, Spring can auto-create an object that can then be auto-wired into a private field within the application.
  @Autowired
  private PubsubOutboundGateway messagingGateway;
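
The gateway, channel, and handler chain can be hard to picture, so here is a dependency-free sketch of the same shape. SimpleChannel and gatewayFor are hypothetical stand-ins written for this example; in the real application Spring Integration generates the gateway implementation and wires the channel for you.

```java
import java.util.function.Consumer;

/** Hypothetical plain-Java stand-in for the gateway -> channel -> handler chain. */
final class ChannelSketch {
    /** Minimal point-to-point channel with a single subscribed handler. */
    static final class SimpleChannel {
        private Consumer<String> handler;

        void subscribe(Consumer<String> handler) {
            this.handler = handler;
        }

        void send(String message) {
            if (handler == null) {
                throw new IllegalStateException("No handler subscribed");
            }
            // Synchronous hand-off on the caller's thread.
            handler.accept(message);
        }
    }

    /** Same shape as the gateway interface above: callers only see this method. */
    interface OutboundGateway {
        void sendToPubsub(String text);
    }

    /** Builds a gateway whose only job is to write to the given channel. */
    static OutboundGateway gatewayFor(SimpleChannel channel) {
        return channel::send;
    }
}
```

The controller only depends on the gateway interface, so it never needs to know which topic, or even which messaging system, sits behind the channel.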

Step 7: Adding Controller Logic

  • Next, you need to add logic to your controller that allows you to write to a Spring channel:
src/main/java/hello/WebAppController.java
package hello;

import hello.PubSubApplication.PubsubOutboundGateway;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.view.RedirectView;

@RestController
public class WebAppController {

  // tag::autowireGateway[]
  @Autowired
  private PubsubOutboundGateway messagingGateway;
  // end::autowireGateway[]

  @PostMapping("/publishMessage")
  public RedirectView publishMessage(@RequestParam("message") String message) {
    messagingGateway.sendToPubsub(message);
    return new RedirectView("/");
  }
}

Step 8: Authenticating and Running the Application

  • You would have to authenticate your application either through the spring.cloud.gcp.credentials.location property or the GOOGLE_APPLICATION_CREDENTIALS environment variable. If you’ve installed Google Cloud SDK, you can log in with your user account by leveraging the gcloud auth application-default login command.
  • You can package this service as a traditional WAR file for deployment to an external application server, or take the simpler approach of creating a standalone application, where everything is packaged in a single, executable JAR file driven by a Java main() method. This approach relies on Spring’s support for embedding the Tomcat servlet container as the HTTP runtime, instead of deploying to an external instance.
  • You can run the application from the command line with Maven or Gradle, or build a single executable JAR file that contains all the required classes, dependencies, and resources and run that.
  • Building an executable JAR file makes it easier to version, ship, and deploy the service as an application throughout the development lifecycle and across different environments.
  • If you use Gradle, you can run the application with ./gradlew bootRun. Alternatively, you can build the JAR file with ./gradlew build and then run it, as follows:
java -jar build/libs/gs-messaging-gcp-pubsub-0.1.0.jar
  • If you use Maven, you can run the application with ./mvnw spring-boot:run. Alternatively, you can build the JAR file with ./mvnw clean package and then run it, as follows:
java -jar target/gs-messaging-gcp-pubsub-0.1.0.jar
  • The application then prints its logging output, and the service should be up and running within a few seconds.
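
The credential options at the start of this step can be summarized with a small lookup sketch. CredentialsSketch is a hypothetical helper and is not how Spring Cloud GCP resolves credentials internally; checking the property before the environment variable is an assumption, based on the earlier note that values from the properties file take precedence.

```java
import java.util.Optional;

/** Hypothetical lookup order for a credentials file; not Spring Cloud GCP's own logic. */
final class CredentialsSketch {
    /**
     * @param propertyLocation value of spring.cloud.gcp.credentials.location, or null
     * @param envPath          value of GOOGLE_APPLICATION_CREDENTIALS, or null
     * @return the file path to use, or empty to fall back to application-default
     *         credentials (for example after "gcloud auth application-default login")
     */
    static Optional<String> resolve(String propertyLocation, String envPath) {
        if (propertyLocation != null && !propertyLocation.trim().isEmpty()) {
            String location = propertyLocation.trim();
            // The property uses a "file:" prefix in application.properties.
            return Optional.of(
                location.startsWith("file:") ? location.substring("file:".length()) : location);
        }
        if (envPath != null && !envPath.trim().isEmpty()) {
            return Optional.of(envPath.trim());
        }
        return Optional.empty();
    }
}
```

In an application you might call it as CredentialsSketch.resolve(configuredLocation, System.getenv("GOOGLE_APPLICATION_CREDENTIALS")), where configuredLocation is whatever value you read from your own configuration.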

Step 9: Testing the Application

  • Once the application is up and running, you can test it out. Open http://localhost:8080, type a message in the input text box, and press the “Publish!” button. Then verify that the message was correctly logged in your process terminal window.
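
If you prefer to test without a browser, the form submission can be reproduced from code. The sketch below only builds the request. It assumes Java 11 or later (for java.net.http) and that the application listens on http://localhost:8080 as described above; PublishRequestSketch is a hypothetical helper name.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that mimics the HTML form's POST to /publishMessage. */
final class PublishRequestSketch {
    /** Encodes the message the same way a browser form submission would. */
    static String formBody(String message) {
        return "message=" + URLEncoder.encode(message, StandardCharsets.UTF_8);
    }

    /** Builds (but does not send) the POST request for the given base URL. */
    static HttpRequest build(String baseUrl, String message) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/publishMessage"))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formBody(message)))
            .build();
    }
}
```

To actually send it, pass the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Because the controller returns a RedirectView, a redirect response rather than a page body is the likely result, and the published message should appear in the application log.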

Conclusion

This blog talks about the different salient aspects of Google Message Queue, and the different methods you can implement to deploy it for your Data Pipeline. It includes the salient differences between Google Cloud Tasks and Google Pub/Sub as well. It also gives a brief introduction to Google Cloud Platform before diving into the nitty-gritty of Google Message Queue.

However, these methods require some technical knowledge to customize your pipelines and to perform even the simplest data transformations. This creates potential bottlenecks, as business teams still have to wait on Engineering teams to provide the data. You can streamline this process by opting for a beginner-friendly, cloud-based, No-Code Data Integration Platform like Hevo Data.

Visit our Website to Explore Hevo

Hevo Data, a No-code Data Pipeline can Ingest Data in Real-Time from a vast sea of 100+ sources to a Data Warehouse, BI Tool, or a Destination of your choice. It is a reliable, completely automated, and secure service that doesn’t require you to write any code!  

If you are using CRMs, Sales, HR, and Marketing applications and searching for a no-fuss alternative to Manual Data Integration, then Hevo can effortlessly automate this for you. Hevo, with its strong integration with 100+ sources and BI tools(Including 40+ Free Sources), allows you to not only export & load data but also transform & enrich your data & make it analysis-ready in a jiffy. Want to take Hevo for a ride? Sign Up for a 14-day free trial and simplify your Data Integration process. Do check out the pricing details to understand which plan fulfills all your business needs.
