RenderScript suddenly and mysteriously becomes much slower when dealing with a Java for loop and memory access through a C pointer

I'm working with a Nexus 4 device running Android 4.3. The issue can also be reproduced on a Lenovo K900 running Android 4.2.2.

The code is not running on the GPU, only on the CPU: I checked CPU usage through adb, and it shows more than 90% CPU usage while the program is running.

Before pasting the code, let me summarize the problem. In my project, I need to continuously process an image (or images) and store the result in another buffer. Because of the algorithm I use, I need to parallelize the processing by image row (different rows are processed simultaneously). To do this, I created an Allocation that carries only the row index and use it to call the forEach function. I also created a global pointer on the RS side and bound another 1D Allocation to it on the Java side, so that the RS code can write results to the buffer through this pointer. I also need to execute the forEach function many times per run, so I call it inside a for loop on the Java side. However, I ran into something quite strange. Let me paste the code first.

In MainActivity.java:

package com.example.slowrs;

import java.io.IOException;
import java.io.InputStream;

import com.example.slowrs.R;

import android.os.Bundle;
import android.renderscript.Allocation;
import android.renderscript.Element;
import android.renderscript.RenderScript;
import android.renderscript.Type;
import android.app.Activity;
import android.content.res.AssetManager;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.util.Log;
import android.view.Menu;
import android.view.View;
import android.widget.*;
import android.renderscript.*;

public class MainActivity extends Activity implements Button.OnClickListener{
    private Bitmap mBitmap;
    private RenderScript mRS;
    private ScriptC_test mTestScript;

    private Allocation mImgAlloc;
    private Allocation mRowAlloc;

    private TextView mTextView;
    private ImageView imgView;

    private String TAG = "test";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        mRS = RenderScript.create(this);

        mBitmap = getImageFromAssetsFile("input.png");
        imgView = (ImageView)findViewById(R.id.display);
        imgView.setImageBitmap(mBitmap);
        imgView.setOnClickListener(this);

        mTextView = (TextView)findViewById(R.id.label);

        mImgAlloc = Allocation.createFromBitmap(mRS, mBitmap, Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);
        Type.Builder tb = new Type.Builder(mRS, Element.U8(mRS));
        tb.setX(1); tb.setY(mImgAlloc.getType().getY());
        Type type = tb.create();
        // Parallelize w.r.t this
        mRowAlloc = Allocation.createTyped(mRS, type, Allocation.USAGE_SCRIPT);

        Type.Builder tb1 = new Type.Builder(mRS, Element.I32(mRS));
        tb1.setX(mImgAlloc.getType().getX()*mImgAlloc.getType().getY()); tb1.setY(1);
        Type type1 = tb1.create();
        Allocation newBufferAlloc = Allocation.createTyped(mRS, type1, Allocation.USAGE_SCRIPT);

        mTestScript = new ScriptC_test(mRS, getResources(), R.raw.test);
        mTestScript.set_image(mImgAlloc);
        mTestScript.bind_buffer(newBufferAlloc);
        mTestScript.set_imgWidth(mImgAlloc.getType().getX());
    }

    public void onClick(View v) {  
        // TODO Auto-generated method stub  
        Log.i(TAG, "touched");

        long timeBeforeExe = System.nanoTime();         

        for(int i = 0; i < 150; i++){
            mTestScript.forEach_slowTest(mRowAlloc);
        }

        long ct = System.nanoTime();
        long offset = ct - timeBeforeExe;
        float offsetInMs = (float)(offset)/1000000;
        mTextView.setText("Time: " + Float.toString(offsetInMs) + "ms");
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.main, menu);
        return true;
    }

    private Bitmap getImageFromAssetsFile(String fileName)
    {  
        Bitmap image = null;  
        AssetManager am = getResources().getAssets();  
        try  
        {  
            InputStream is = am.open(fileName);  
            image = BitmapFactory.decodeStream(is);  
            is.close();  
        }  
        catch (IOException e)  
        {  
            e.printStackTrace();  
        }  

        return image;
    }
}

In test.rs:

#pragma version(1)
#pragma rs java_package_name(com.example.slowrs)
#pragma rs_fp_relaxed

int* buffer;
rs_allocation image;
int imgWidth;

void __attribute__((kernel)) slowTest(uchar in, uint32_t x, uint32_t y){
    for(int col = 0; col < imgWidth; col++){
        const uchar4 rightImgNextPixel = *(const uchar4*)rsGetElementAt(image, col, y);
        buffer[y * imgWidth + col] = rightImgNextPixel.x + 10;      
        //buffer[y * imgWidth + col] = 10;
    }
}

In activity_main.xml (the layout)

<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:paddingBottom="@dimen/activity_vertical_margin"
    android:paddingLeft="@dimen/activity_horizontal_margin"
    android:paddingRight="@dimen/activity_horizontal_margin"
    android:paddingTop="@dimen/activity_vertical_margin"
    tools:context=".MainActivity" >

    <LinearLayout 
            android:orientation="vertical"
            android:layout_width="fill_parent"
            android:layout_height="fill_parent"
            android:id="@+id/toplevel">

        <ImageView
            android:id="@+id/display"
            android:layout_width="320dip"
            android:layout_height="266dip" />

        <TextView
            android:id="@+id/label"
            android:layout_height="wrap_content"
            android:layout_width="fill_parent"
            android:text="Time:"
            android:padding="2dp"
            android:textSize="16sp"
            android:gravity="center" />
    </LinearLayout>
</RelativeLayout>

The three files I pasted contain everything needed to reproduce this issue. Basically, the code loads an image into an Allocation and displays it on the screen. Once the image is tapped, the onClick() function runs and the forEach function is called.

input.png is just a normal 640*480 PNG file I put in the assets folder of the project. Any image of the same size will do.

The problem is the following. When I tap the image gently (about once a second), everything is fine: the text on the UI shows that the whole image processing procedure finishes very quickly (in several ms). However, if I tap the image faster (as fast as I can, roughly 5 to 6 taps a second), things change. The text on the UI shows that some taps take more than 500 ms to finish (on the Nexus 4), while others still take several ms. From what I see, the slow pass is more than 100 times slower than the fast pass, which is strange.

After some testing, I found two things that make this sudden slowdown go away. I either do

for(int i = 0; i < 1; i++){
    mTestScript.forEach_slowTest(mRowAlloc);
}

that is, make the for loop in Java smaller, or

void __attribute__((kernel)) slowTest(uchar in, uint32_t x, uint32_t y){
    for(int col = 0; col < imgWidth; col++){
        const uchar4 rightImgNextPixel = *(const uchar4*)rsGetElementAt(image, col, y);
        //buffer[y * imgWidth + col] = rightImgNextPixel.x + 10;        
        buffer[y * imgWidth + col] = 10;
    }
}

that is, do not refer to rightImgNextPixel.x when setting the new value in buffer. Either of the two makes the slowdown disappear. You may test it yourself. However, I can't explain why either of them works.

What's happening? This issue is making me crazy and seriously affecting the performance of the image processing task. Please help, thank you!

Recommended answer

You are not measuring the actual execution time. Try adding rs.finish() or reading the results back from your operation. RS is asynchronous: it queues up operations until the buffers fill up or a result is needed. Thus the loop of kernel launches just gets queued up.
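In the question's onClick(), this would presumably mean calling mRS.finish() after the for loop and before taking the second System.nanoTime() reading. The effect can be sketched off-device with plain Java. The block below is only an analogy, not RenderScript code: a single-thread executor stands in for the RS command queue, Thread.sleep stands in for kernel work, and the class and method names (AsyncTimingDemo, measure) are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative stand-in only: the executor plays the role of the RenderScript
// command queue. None of these names belong to the RenderScript API.
final class AsyncTimingDemo {

    /** Returns { nanos until all launches were queued, nanos until all work finished }. */
    static long[] measure(int launches, long workMillis) throws InterruptedException {
        ExecutorService queue = Executors.newSingleThreadExecutor();
        long start = System.nanoTime();

        for (int i = 0; i < launches; i++) {
            // Analogous to forEach_slowTest(...): returns once the work is queued.
            queue.execute(() -> {
                try {
                    Thread.sleep(workMillis); // pretend kernel work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Timestamp taken here only covers the cost of queuing.
        long queued = System.nanoTime() - start;

        // Analogous to mRS.finish(): block until everything queued has run.
        queue.shutdown();
        queue.awaitTermination(1, TimeUnit.MINUTES);
        long finished = System.nanoTime() - start;

        return new long[] { queued, finished };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] t = measure(5, 20);
        System.out.printf("queued after %.2f ms, finished after %.2f ms%n",
                t[0] / 1e6, t[1] / 1e6);
    }
}
```

The first number corresponds to what the question's timer reports on a "fast" tap; the second is the time the work actually took. A tap that arrives while the queue is still full would have to wait for earlier work, which is one plausible reading of the occasional 500 ms result.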

On a related note, I would suggest using the return value from the kernel to write the output buffer, or rsSetElementAt_uchar4, rather than binding a global pointer. RS makes no guarantees about the layout of 2D memory, and in some cases this code will not generate the correct result because the stride of the memory differs from the width.
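To see why a stride that differs from the width breaks manual indexing of the form y * width + col, here is a small plain-Java illustration. The layout is hypothetical: the width of 6, the stride of 8, and the padding value are made-up numbers, and the names (StrideDemo and its methods) are not part of any RenderScript API. The actual layout RS uses is not specified in the answer.

```java
import java.util.Arrays;

// Illustration with made-up numbers: a 2D image whose rows are padded in memory.
final class StrideDemo {
    static final int PAD = -1; // marks padding elements that are not real pixels

    /** Builds a height x width image stored with `stride` elements per row. */
    static int[] makePaddedImage(int width, int height, int stride) {
        int[] mem = new int[stride * height];
        Arrays.fill(mem, PAD);
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                mem[row * stride + col] = row * 10 + col; // pixel value encodes its position
            }
        }
        return mem;
    }

    /** Indexing that assumes tightly packed rows, like buffer[y * imgWidth + col]. */
    static int readAssumingPacked(int[] mem, int col, int row, int width) {
        return mem[row * width + col];
    }

    /** Indexing that accounts for the real row stride. */
    static int readWithStride(int[] mem, int col, int row, int stride) {
        return mem[row * stride + col];
    }

    public static void main(String[] args) {
        int width = 6, height = 2, stride = 8;
        int[] mem = makePaddedImage(width, height, stride);
        // Pixel (col 0, row 1) holds the value 10.
        System.out.println(readAssumingPacked(mem, 0, 1, width)); // prints -1 (lands in padding)
        System.out.println(readWithStride(mem, 0, 1, stride));    // prints 10
    }
}
```

Accessor functions such as rsGetElementAt and rsSetElementAt_uchar4 take (x, y) coordinates, so the runtime resolves the layout instead of the kernel guessing it.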
