How do you read and write HBase data with the MapReduce framework?

2025-06-15 11:04 Q&A author: 泡妞宝典

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseExample {
        public static void main(String[] args) throws IOException {
            Configuration config = HBaseConfiguration.create();
            Connection connection = ConnectionFactory.createConnection(config);
            Table table = connection.getTable(TableName.valueOf("test"));

            // Write one cell: row "row1", column family "col1", qualifier "qual1"
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("col1"), Bytes.toBytes("qual1"), Bytes.toBytes("value1"));
            table.put(put);

            // Read the same cell back
            Get get = new Get(Bytes.toBytes("row1"));
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("col1"), Bytes.toBytes("qual1"));
            System.out.println("Value: " + Bytes.toString(value));

            table.close();
            connection.close();
        }
    }

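This program first connects to HBase, inserts one row into the table "test", and then reads that row back and prints it. It assumes the table "test" already exists with a column family named "col1".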

Sample MapReduce program for reading and writing HBase data

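1. Preparation

Before writing the MapReduce program, make sure Hadoop and HBase are installed. You should also be familiar with the basic HBase table structure and data model.

2. Create the HBase table

First, create an HBase table to hold the sample data. Suppose the table is named user_data and has two column families: info and address. In the HBase shell: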

create 'user_data', {NAME => 'info'}, {NAME => 'address'}
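To make the table structure above more concrete, here is a minimal in-memory sketch of the data model, using only the JDK. It is not the HBase API: the class name TableModelSketch and its methods are hypothetical, values are plain strings instead of byte arrays, and timestamps and cell versions are left out. It only illustrates two points: column families are fixed when the table is created, while qualifiers inside a family can be added freely per row.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, simplified model of an HBase table (not the HBase client API).
public class TableModelSketch {
    // Column families are declared up front, like in the create command.
    private final Set<String> families;
    // row key -> "family:qualifier" -> value; both levels are kept sorted.
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    public TableModelSketch(String... families) {
        this.families = new HashSet<>(Arrays.asList(families));
    }

    public void put(String row, String family, String qualifier, String value) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException("Unknown column family: " + family);
        }
        rows.computeIfAbsent(row, r -> new TreeMap<>()).put(family + ":" + qualifier, value);
    }

    public String get(String row, String family, String qualifier) {
        Map<String, String> cells = rows.get(row);
        return cells == null ? null : cells.get(family + ":" + qualifier);
    }

    public static void main(String[] args) {
        TableModelSketch userData = new TableModelSketch("info", "address");
        userData.put("u001", "info", "name", "Alice");
        userData.put("u001", "address", "city", "Beijing");
        System.out.println(userData.get("u001", "info", "name")); // prints "Alice"
    }
}
```

A put to a family that was not declared, such as "contact", is rejected in this sketch, which mirrors how a write to an undeclared column family fails in HBase.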

3. Write the MapReduce program

3.1 Mapper class

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class HBaseWriteMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Each input line is tab-separated: rowKey, name, age, address
            String[] fields = value.toString().split("\t");
            if (fields.length != 4) return; // Skip malformed lines

            String rowKey = fields[0];
            String name = fields[1];
            String age = fields[2];
            String address = fields[3];

            ImmutableBytesWritable hbaseRowKey = new ImmutableBytesWritable(Bytes.toBytes(rowKey));
            Put put = new Put(hbaseRowKey.get());
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes(name));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("age"), Bytes.toBytes(age));
            put.addColumn(Bytes.toBytes("address"), Bytes.toBytes("city"), Bytes.toBytes(address));
            context.write(hbaseRowKey, put);
        }
    }
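The mapper only accepts lines with exactly four tab-separated fields. The sketch below isolates that parsing rule in plain Java so it can be tried without a cluster. The class UserRecordParser, the UserRecord type, and the sample values are hypothetical and not part of the original program.

```java
import java.util.Optional;

// Hypothetical helper that mirrors the mapper's field check, without Hadoop or HBase.
public class UserRecordParser {
    /** Holds the four fields the mapper reads from one input line. */
    public static final class UserRecord {
        public final String rowKey;
        public final String name;
        public final String age;
        public final String city;

        UserRecord(String rowKey, String name, String age, String city) {
            this.rowKey = rowKey;
            this.name = name;
            this.age = age;
            this.city = city;
        }
    }

    /** Returns the parsed record, or empty if the line does not have exactly four fields. */
    public static Optional<UserRecord> parse(String line) {
        String[] fields = line.split("\t");
        if (fields.length != 4) {
            return Optional.empty();
        }
        return Optional.of(new UserRecord(fields[0], fields[1], fields[2], fields[3]));
    }

    public static void main(String[] args) {
        System.out.println(parse("u001\tAlice\t31\tBeijing").isPresent()); // prints "true"
        System.out.println(parse("u002\tBob\t27").isPresent());            // prints "false"
    }
}
```

Note that String.split drops trailing empty strings, so a line whose last field is empty yields three fields and is skipped as well.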

3.2 Reducer class

This example does not need a Reducer: the job only writes data into HBase, so the Reducer class can be omitted.

3.3 Driver class

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class HBaseWriteDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", "localhost"); // ZooKeeper address
            conf.set("hbase.zookeeper.property.clientPort", "2181"); // ZooKeeper port
            conf.set(TableOutputFormat.OUTPUT_TABLE, "user_data"); // Output table name

            Job job = Job.getInstance(conf, "HBase Write Example");
            job.setJarByClass(HBaseWriteDriver.class);
            job.setMapperClass(HBaseWriteMapper.class);
            job.setNumReduceTasks(0); // Map-only job, no Reducer

            // Send the mapper's Puts to HBase. Without this, the default
            // TextOutputFormat would write them to HDFS as text instead.
            job.setOutputFormatClass(TableOutputFormat.class);
            job.setOutputKeyClass(ImmutableBytesWritable.class);
            job.setOutputValueClass(Put.class);
            TableMapReduceUtil.addDependencyJars(job); // Ship the HBase jars with the job

            FileInputFormat.addInputPath(job, new Path(args[0])); // Input file path
            // TableOutputFormat does not write to this path; it is kept so the
            // two-argument command line still works.
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

4. Run the program

After compiling and packaging the Java code, run the MapReduce program with the following command:

hadoop jar yourcompiledjarfile.jar com.example.HBaseWriteDriver inputpath outputpath

Here yourcompiledjarfile.jar is your compiled JAR file, inputpath is the HDFS path that contains the input data, and outputpath is the HDFS output path passed to the job as its second argument.

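5. Questions and answers

Question 1: How do you read data from an HBase table?

Answer 1: To read data from an HBase table, you can use either the HBase API or a MapReduce job. Here is a simple example that reads data with the HBase API: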

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.*;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.*;

    public class HBaseReadExample {
        public static void main(String[] args) throws Exception {
            Configuration config = HBaseConfiguration.create();
            config.set("hbase.zookeeper.quorum", "localhost");
            config.set("hbase.zookeeper.property.clientPort", "2181");

            Connection connection = ConnectionFactory.createConnection(config);
            Table table = connection.getTable(TableName.valueOf("user_data"));

            Get get = new Get(Bytes.toBytes("rowKey1")); // Replace with the row key you want to query
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
            System.out.println("Name: " + Bytes.toString(value));

            table.close();
            connection.close();
        }
    }

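Question 2: How do you query data in HBase with a filter?

Answer 2: In HBase you can use filters to narrow down the data a query returns. For example, to fetch the users aged 30 or older, you can use SingleColumnValueFilter. Because the age is stored as a string, the filter compares the bytes lexicographically rather than numerically. Here is a simple example: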

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.*;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.filter.*;
    import org.apache.hadoop.hbase.util.*;

    public class HBaseFilterExample {
        public static void main(String[] args) throws Exception {
            Configuration config = HBaseConfiguration.create();
            config.set("hbase.zookeeper.quorum", "localhost");
            config.set("hbase.zookeeper.property.clientPort", "2181");

            Connection connection = ConnectionFactory.createConnection(config);
            Table table = connection.getTable(TableName.valueOf("user_data"));

            Scan scan = new Scan();
            // HBase 2.x deprecates CompareFilter.CompareOp in favor of CompareOperator.
            SingleColumnValueFilter filter = new SingleColumnValueFilter(
                    Bytes.toBytes("info"), Bytes.toBytes("age"),
                    CompareFilter.CompareOp.GREATER_OR_EQUAL, Bytes.toBytes("30"));
            filter.setFilterIfMissing(true); // Skip rows that have no info:age column
            scan.setFilter(filter);

            ResultScanner scanner = table.getScanner(scan);
            for (Result result : scanner) {
                System.out.println("Row: " + Bytes.toString(result.getRow())
                        + " Age: " + Bytes.toString(result.getValue(Bytes.toBytes("info"), Bytes.toBytes("age"))));
            }

            scanner.close();
            table.close();
            connection.close();
        }
    }
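One caveat with the filter above: the age is stored as string bytes, and a byte-wise comparison against "30" does not behave like a numeric comparison. The sketch below reproduces that comparison in plain Java, assuming an unsigned lexicographic byte order. The class LexicographicAgeDemo and its helper methods are hypothetical; zero-padding to a fixed width is shown as one possible workaround, not as the original author's method.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of byte-wise string comparison; does not use HBase classes.
public class LexicographicAgeDemo {
    /** Compares two byte arrays lexicographically, treating bytes as unsigned. */
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** True if the stored string is >= the threshold under byte-wise comparison. */
    public static boolean atLeast(String stored, String threshold) {
        return compareBytes(stored.getBytes(StandardCharsets.UTF_8),
                threshold.getBytes(StandardCharsets.UTF_8)) >= 0;
    }

    /** Zero-pads an age to three digits so byte order matches numeric order. */
    public static String pad(int age) {
        return String.format("%03d", age);
    }

    public static void main(String[] args) {
        System.out.println(atLeast("4", "30"));            // prints "true", although 4 < 30
        System.out.println(atLeast("100", "30"));          // prints "false", although 100 > 30
        System.out.println(atLeast(pad(4), pad(30)));      // prints "false"
        System.out.println(atLeast(pad(100), pad(30)));    // prints "true"
    }
}
```

If ages were written zero-padded (or as fixed-width binary integers), the same GREATER_OR_EQUAL filter would select the rows a numeric comparison would.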
