SnappyData Tutorial


1. Running Spark and SnappyData separately:

Start Spark:

cd /opt/spark-2.1.1-bin-hadoop2.7/sbin
./start-master.sh
./start-slave.sh spark://localhost:7077
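
To sanity-check that the standalone master is accepting applications, you can point a throwaway SparkSession at it. A minimal sketch (MasterCheck is an illustrative name; the master URL matches the one used above):

import org.apache.spark.sql.SparkSession

object MasterCheck {
  def main(args: Array[String]): Unit = {
    // Connect to the standalone master started above and run a trivial job.
    val spark = SparkSession.builder
      .appName("MasterCheck")
      .master("spark://localhost:7077")
      .getOrCreate()
    val sum = spark.sparkContext.parallelize(1 to 10).sum()
    println(s"Connected to master; test sum = $sum")
    spark.stop()
  }
}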

Start SnappyData:

mkdir -p /opt/snappydata-1.0.1-bin/work/localhost-locator-1
mkdir -p /opt/snappydata-1.0.1-bin/work/localhost-server-1
mkdir -p /opt/snappydata-1.0.1-bin/work/localhost-lead-1
cd /opt/snappydata-1.0.1-bin/bin/
./snappy locator start -dir=/opt/snappydata-1.0.1-bin/work/localhost-locator-1
./snappy server start -dir=/opt/snappydata-1.0.1-bin/work/localhost-server-1 -locators=localhost[10334] -heap-size=8g
./snappy leader start -dir=/opt/snappydata-1.0.1-bin/work/localhost-lead-1 -locators=localhost[10334] -spark.executor.cores=4
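
Once the locator, server, and lead are up, you can verify that the store accepts client connections on the default port 1527. Below is a minimal Scala sketch over plain JDBC, assuming the SnappyData client JAR is on the classpath; the driver class io.snappydata.jdbc.ClientDriver and the jdbc:snappydata:// URL scheme are SnappyData's documented defaults, but verify them against your version:

import java.sql.DriverManager

object StoreConnectivityCheck {
  def main(args: Array[String]): Unit = {
    // Driver class and URL scheme follow SnappyData's documented defaults.
    Class.forName("io.snappydata.jdbc.ClientDriver")
    val conn = DriverManager.getConnection("jdbc:snappydata://localhost:1527/")
    try {
      // VALUES is Derby-style SQL, which SnappyData's store dialect supports.
      val rs = conn.createStatement().executeQuery("values current_timestamp")
      if (rs.next()) println(s"Store is up, server time: ${rs.getTimestamp(1)}")
    } finally {
      conn.close()
    }
  }
}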

Run the example:

package org.apache.spark.examples.snappydata

import org.apache.spark.sql.{SnappySession, SparkSession}

/**
 * This example shows how an application can interact with SnappyStore in Split cluster mode.
 * In this mode an application can access the metastore of an existing running SnappyStore,
 * so it can query and write to tables that reside in the SnappyStore.
 *
 * To run this example you need to set up a Snappy cluster first, following the steps
 * mentioned below.
 *
 * 1.  Go to SNAPPY_HOME, your Snappy installation directory.
 *
 * 2.  Start a Snappy cluster
 * ./sbin/snappy-start-all.sh
 * This will start a simple cluster with one data node, one lead node and a locator
 *
 * 3.  Open Snappy Shell
 * ./bin/snappy-sql
 * This will open Snappy shell which can be used to create and query tables.
 *
 * 4. Connect to the Snappy Cluster. On the shell prompt type
 * connect client 'localhost:1527';
 *
 * 5. Create a column table and insert some rows in SnappyStore. Type the following in the Snappy shell.
 *
 * CREATE TABLE SNAPPY_COL_TABLE(r1 Integer, r2 Integer) USING COLUMN;
 *
 * insert into SNAPPY_COL_TABLE VALUES(1,1);
 * insert into SNAPPY_COL_TABLE VALUES(2,2);
 *
 * 6. Run this example to see how the program interacts with the Snappy cluster
 * table (SNAPPY_COL_TABLE) created above. The program also creates a table in SnappyStore.
 * After running this example you can query that table from the Snappy shell,
 * e.g. select count(*) from TestColumnTable.
 *
 * bin/run-example snappydata.SmartConnectorExample
 *
 */

object SmartConnectorExample {

  def main(args: Array[String]): Unit = {

    val builder = SparkSession
      .builder
      .appName("SmartConnectorExample")
      .master("spark://localhost:7077")
      // snappydata.connection property enables the application to interact with SnappyData store
      .config("snappydata.connection", "localhost:1527")


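    // Extra command-line arguments of the form key=value are applied as config.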
    args.foreach( prop => {
      // Split on the first '=' only, so values may themselves contain '='.
      val params = prop.split("=", 2)
      builder.config(params(0), params(1))
    })

    val spark: SparkSession = builder
        .getOrCreate
    val snSession = new SnappySession(spark.sparkContext)

    println("\n\n ####  Reading from the SnappyStore table SNAPPY_COL_TABLE  ####  \n")
    val colTable = snSession.table("SNAPPY_COL_TABLE")
    colTable.show(10)


    println(" ####  Creating a table TestColumnTable  #### \n")

    snSession.dropTable("TestColumnTable", ifExists = true)

    // Creating a table from a DataFrame
    val dataFrame = snSession.range(1000).selectExpr("id", "floor(rand() * 10000) as k")

    snSession.sql("create table TestColumnTable (id bigint not null, k bigint not null) using column")

    dataFrame.write.insertInto("TestColumnTable")

    println(" ####  Write to table completed. ####  \n\n" +
        "Now you can query table TestColumnTable using $SNAPPY_HOME/bin/snappy-sql")

  }

}
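
Besides raw SQL DDL, SnappySession also exposes a programmatic createTable API. A hedged sketch of the equivalent table creation follows; the signature shown is assumed from the SnappyData 1.0.x API, so verify it against your version:

import org.apache.spark.sql.SnappySession
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Programmatic equivalent of the "create table ... using column" DDL above.
// Signature assumed from the SnappyData 1.0.x API; verify against your version.
def createTestColumnTable(snSession: SnappySession): Unit = {
  val schema = StructType(Seq(
    StructField("id", LongType, nullable = false),
    StructField("k", LongType, nullable = false)))
  snSession.dropTable("TestColumnTable", ifExists = true)
  snSession.createTable("TestColumnTable", "column", schema, Map.empty[String, String])
}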

2. Running Spark and SnappyData together:

Getting Started with your Spark Distribution

If you are a Spark developer already using Spark 2.1.1, the fastest way to work with SnappyData is to add it as a dependency, for instance via the --packages option of the Spark shell.

Open a command terminal, go to the location of the Spark installation directory, and enter the following:

$ cd <Spark_Install_dir>
# Create a directory for SnappyData artifacts
$ mkdir quickstartdatadir
$ ./bin/spark-shell --conf spark.snappydata.store.sys-disk-dir=quickstartdatadir --conf spark.snappydata.store.log-file=quickstartdatadir/quickstart.log --packages "SnappyDataInc:snappydata:1.0.1-s_2.11"

This opens the Spark shell and downloads the relevant SnappyData files to your local machine. Depending on your network connection speed, it may take some time to download the files.
All SnappyData metadata, as well as persistent data, is stored in the directory quickstartdatadir. The spark-shell can now be used to work with SnappyData using Scala APIs and SQL.

Using the Spark Shell to Run Snappy SQL

After opening the Spark shell, import SnappySession and create an instance:

scala> import org.apache.spark.sql.SnappySession
scala> val snappy = new SnappySession(spark.sparkContext)

Then we can run Snappy SQL:

scala> snappy.sql("CREATE TABLE SNAPPY_COL_TABLE(r1 Integer, r2 Integer) USING COLUMN")
scala> snappy.sql("insert into SNAPPY_COL_TABLE VALUES(1,1)")
scala> snappy.sql("insert into SNAPPY_COL_TABLE VALUES(2,2)")
scala> snappy.sql("select count(*) from SNAPPY_COL_TABLE")
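
Note that sql() returns a DataFrame; the shell echoes its schema but not its contents, so call show() to display the result:

scala> snappy.sql("select count(*) from SNAPPY_COL_TABLE").show()

With the two rows inserted above, this should print a single-row result of 2.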


Reposted from: https://my.oschina.net/u/2935389/blog/1819826
