
Secondary Sort

news  Source: original  2024/5/17 3:35:38

Sample input:

file1        file2

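Approach: we define a new class, MyNewKey, that implements the WritableComparable interface. The mapper emits MyNewKey objects as its output keys, and the framework sorts those keys automatically (by calling compareTo) on the way from map to reduce. The reduce phase then only has to write the keys out.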
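To make the ordering concrete, here is a minimal sketch in plain Java, with no Hadoop dependency, of how a composite key sorts: first by the first number, then by the second. The names SecondarySortSketch and PairKey and the sample pairs are made up for illustration; Collections.sort merely stands in for the sort that the framework performs between map and reduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SecondarySortSketch {

    // Hypothetical stand-in for the composite key: ordered by first, then by second.
    static final class PairKey implements Comparable<PairKey> {
        final long first;
        final long second;

        PairKey(long first, long second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public int compareTo(PairKey other) {
            int byFirst = Long.compare(first, other.first);
            return byFirst != 0 ? byFirst : Long.compare(second, other.second);
        }

        @Override
        public String toString() {
            return first + "\t" + second;
        }
    }

    // Returns a sorted copy, mimicking the order in which a reducer would receive the keys.
    static List<PairKey> sorted(List<PairKey> keys) {
        List<PairKey> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Made-up sample pairs, not the contents of file1/file2.
        List<PairKey> input = Arrays.asList(
                new PairKey(3, 3), new PairKey(1, 2), new PairKey(3, 1), new PairKey(1, 1));
        for (PairKey key : sorted(input)) {
            System.out.println(key);
        }
    }
}
```

Running main prints the pairs in the order (1, 1), (1, 2), (3, 1), (3, 3): rows with the same first number stay together and are ordered by the second number.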

Code:

MyNewKey.java:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

public class MyNewKey implements WritableComparable<MyNewKey> {
    long firstNum;
    long secondNum;

    // Hadoop needs a no-argument constructor to create the key when deserializing.
    public MyNewKey() {
    }

    public MyNewKey(long first, long second) {
        firstNum = first;
        secondNum = second;
    }

    // Serialize both fields; readFields must read them back in the same order.
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(firstNum);
        out.writeLong(secondNum);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        firstNum = in.readLong();
        secondNum = in.readLong();
    }

    /*
     * compareTo is called whenever the keys are sorted.
     */
    @Override
    public int compareTo(MyNewKey anotherKey) {
        // Long.compare avoids the overflow of casting a long difference to int.
        int byFirst = Long.compare(firstNum, anotherKey.firstNum);
        if (byFirst != 0) {
            // The first columns differ, so they alone decide the order.
            return byFirst;
        } else {
            return Long.compare(secondNum, anotherKey.secondNum);
        }
    }

    public long getFirstNum() {
        return firstNum;
    }

    public long getSecondNum() {
        return secondNum;
    }
}
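A note on compareTo for long fields: a comparison that subtracts two long values and casts the difference to int can return the wrong sign, or even 0 for unequal values, because the cast keeps only the low 32 bits. The sketch below contrasts that with Long.compare; the class name CompareOverflowSketch and the sample values are made up for illustration.

```java
class CompareOverflowSketch {

    // Subtraction-based comparison: the cast keeps only the low 32 bits of the difference.
    static int bySubtraction(long a, long b) {
        return (int) (a - b);
    }

    // Overflow-safe comparison from the standard library.
    static int byLongCompare(long a, long b) {
        return Long.compare(a, b);
    }

    public static void main(String[] args) {
        long big = 1L << 32; // 4294967296, whose low 32 bits are all zero

        // Prints 0: the two values are wrongly reported as equal.
        System.out.println(bySubtraction(big, 0L));

        // Prints a negative number although 3000000000 > 0.
        System.out.println(bySubtraction(3_000_000_000L, 0L));

        // Prints a positive number, as expected.
        System.out.println(byLongCompare(big, 0L));
    }
}
```

For small inputs both versions agree, so the problem only shows up once the values differ by more than the int range.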

Now for the MapReduce code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class doublesort {
    static String INPUT_PATH = "hdfs://master:9000/ls";
    static String OUTPUT_PATH = "hdfs://master:9000/output";

    // With the default TextInputFormat, the input key is the byte offset and the value is the line.
    static class MyMapper extends Mapper<LongWritable, Text, MyNewKey, NullWritable> {
        NullWritable output_value = NullWritable.get();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Each input line is expected to have the form "first,second".
            String[] tokens = value.toString().split(",", 2);
            MyNewKey output_key = new MyNewKey(Long.parseLong(tokens[0]), Long.parseLong(tokens[1]));
            context.write(output_key, output_value);
        }
    }

    static class MyReduce extends Reducer<MyNewKey, NullWritable, LongWritable, LongWritable> {
        LongWritable output_key = new LongWritable();
        LongWritable output_value = new LongWritable();

        @Override
        protected void reduce(MyNewKey key, Iterable<NullWritable> values, Context context)
                throws IOException, InterruptedException {
            // Keys arrive already sorted; reduce is called once per distinct key.
            output_key.set(key.getFirstNum());
            output_value.set(key.getSecondNum());
            context.write(output_key, output_value);
        }
    }

    public static void main(String[] args) throws Exception {
        Path outputpath = new Path(OUTPUT_PATH);
        Configuration conf = new Configuration();

        // Delete a leftover output directory; the job fails if it already exists.
        FileSystem fs = outputpath.getFileSystem(conf);
        if (fs.exists(outputpath)) {
            fs.delete(outputpath, true);
        }

        Job job = Job.getInstance(conf);
        // Tells Hadoop which jar contains the job classes when running on a cluster.
        job.setJarByClass(doublesort.class);
        FileInputFormat.setInputPaths(job, INPUT_PATH);
        FileOutputFormat.setOutputPath(job, outputpath);

        job.setMapperClass(MyMapper.class);
        //job.setPartitionerClass(MyPartitioner.class);
        //job.setNumReduceTasks(2);
        job.setReducerClass(MyReduce.class);

        job.setMapOutputKeyClass(MyNewKey.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(LongWritable.class);

        // Exit with a non-zero status if the job fails.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
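The mapper assumes that every input line has the form first,second; an empty or malformed line would make Long.parseLong throw and fail the task. As a hedged sketch (not part of the original job), the parsing step could be pulled into a helper that reports bad lines instead of throwing. LineParseSketch and parsePair are hypothetical names, and the sample lines are made up.

```java
class LineParseSketch {

    // Parses "first,second" into a two-element array, or returns null for a malformed line.
    static long[] parsePair(String line) {
        String[] tokens = line.split(",", 2);
        if (tokens.length != 2) {
            return null; // no comma on the line
        }
        try {
            return new long[] {
                Long.parseLong(tokens[0].trim()),
                Long.parseLong(tokens[1].trim())
            };
        } catch (NumberFormatException e) {
            return null; // one of the columns is not a number
        }
    }

    public static void main(String[] args) {
        // Made-up sample lines: two valid, three malformed.
        String[] lines = {"3,1", "7, 42", "", "5", "a,b"};
        for (String line : lines) {
            long[] pair = parsePair(line);
            if (pair == null) {
                System.out.println("skipped: \"" + line + "\"");
            } else {
                System.out.println(pair[0] + "\t" + pair[1]);
            }
        }
    }
}
```

Inside map, a null result would simply mean returning without calling context.write, so one bad line does not abort the whole job.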

Output:

Reposted from: https://www.cnblogs.com/luminous1/p/8386510.html
