Tuesday, March 10, 2009

Paroli load time

These days I have been profiling the Paroli application, as there are some issues with scrolling and loading time. I took the loading time issue first and tried to solve it. Right now Enlightenment takes around 7 seconds to fully load and Paroli takes around 20 seconds. Since E17 launches Paroli as a startup application before loading its modules, the GTA02 takes around 30 seconds from the moment the X server starts until Paroli shows up.

The E17 config we currently have is the default one: we basically load every E17 module even if Paroli does not use it. After tweaking the configuration and making illume's module load asynchronously, E17 now loads in half the time, only 3.5 seconds (the new config is already committed). Sadly, Paroli itself still takes too long.

Note that all the numbers above were measured on an NFS system, but the relative slowness is the same on a flash filesystem.

Tuesday, March 3, 2009

Glamo Xrender Benchmark with Expedite

Yesterday I tested the XRender engine of Evas using the EXA acceleration currently available in glamo (that is, solid fills and surface blitting). Sadly the test was taking ages to finish; even after leaving it running the whole night it didn't finish, but hung on the text test.

So I wanted to test just the XRender glue and its implementation on top of EXA, but without painting anything: only the memory moves from system memory to VRAM and the necessary logic in Evas' XRender engine. To do that I "removed" the drawing functions from the xf86-video-glamo driver (they just return TRUE) and... here are the results:

Benchmark                                      Software X11  XRender without painting
Image Blend Unscaled                           2.76          ????
Image Blend Solid Unscaled                     12.69         13.72
Image Blend Nearest Scaled                     1.56          18.14
Image Blend Nearest Solid Scaled               8.77          18.00
Image Blend Smooth Scaled                      0.45          18.22
Image Blend Smooth Solid Scaled                5.93          17.59
Image Blend Nearest Same Scaled                5.02          21.26
Image Blend Nearest Solid Same Scaled          22.05         17.73
Image Blend Smooth Same Scaled                 1.27          20.96
Image Blend Smooth Solid Same Scaled           11.84         17.76
Image Blend Border                             0.51          1.83
Image Blend Solid Border                       6.67          1.97
Image Blend Border Recolor                     0.44          1.23
Image Quality Scale                            4.29          1.97
Image Data ARGB                                7.22          3.71
Image Data ARGB Alpha                          4.89          1.70
Image Data YCbCr 601 Pointer List              6.54          3.16
Image Data YCbCr 601 Pointer List Wide Stride  6.04          5.40
Image Crossfade                                6.67          4.61
Text Basic                                     9.28          2.25
Text Styles                                    1.05          0.17
Text Styles Different Strings                  0.79          0.14
Text Change                                    5.64          1.86
Textblock Basic                                5.67          1.50
Textblock Intl                                 4.67          2.46
Rect Blend                                     1.81          9.66
Rect Solid                                     9.57          18.02
Rect Blend Few                                 69.84         ?????
Rect Solid Few                                 84.22         61.79
Image Blend Occlude 1 Few                      41.09         196.75
Image Blend Occlude 2 Few                      24.00         47.37
Image Blend Occlude 3 Few                      17.50         70.32
Image Blend Occlude 1                          43.26         26.20
Image Blend Occlude 2                          14.59         14.03
Image Blend Occlude 3                          4.87          21.06
Image Blend Occlude 1 Many                     27.31         12.14
Image Blend Occlude 2 Many                     6.81          4.61
Image Blend Occlude 3 Many                     2.21          ????
Image Blend Occlude 1 Very Many                3.79          1.54
Image Blend Occlude 2 Very Many                0.66          0.43
Image Blend Occlude 3 Very Many                0.36          0.58
Polygon Blend                                  3.51          1.69
EVAS SPEED                                     11.86         18.66

The results are very disappointing. There are several tests where drawing in software is faster than just running the XRender/EXA logic for the same result without drawing anything. And in the tests where XRender/EXA is faster, the speedup isn't worth it, as the real drawing will surely make it slower. Note that the Glamo chip can only do raster operations into a destination surface of format RGB565, which means there won't be any acceleration even if the blending is possible in hardware, as Evas uses premultiplied ARGB8888.
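
To make the comparison concrete, here is a quick sketch that computes the ratio between the two columns for a handful of rows copied from the table above. The row selection and the helper name `speedup` are only for illustration, and it assumes that a higher value means a faster test.

```python
# (software X11, XRender without painting) values copied from the table.
results = {
    "Image Blend Solid Unscaled": (12.69, 13.72),
    "Image Blend Smooth Scaled": (0.45, 18.22),
    "Image Blend Border": (0.51, 1.83),
    "Text Basic": (9.28, 2.25),
    "Text Styles": (1.05, 0.17),
    "Rect Solid Few": (84.22, 61.79),
    "EVAS SPEED": (11.86, 18.66),
}


def speedup(software, xrender):
    """Ratio above 1.0 means the no-paint XRender path scored higher."""
    return xrender / software


# Print the rows from worst to best ratio.
for name, (sw, xr) in sorted(results.items(), key=lambda kv: speedup(*kv[1])):
    print(f"{speedup(sw, xr):6.2f}x  {name}")

# Rows where software rendering beats XRender even without painting.
slower = [name for name, (sw, xr) in results.items() if speedup(sw, xr) < 1.0]
print("software wins:", slower)
```

In this sample the text tests and "Rect Solid Few" already lose to software before a single pixel is painted.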

So how can we improve the rendering speed of Evas specifically for this chip? The path through XRender/EXA is worthless; is there any other way? Well, one possibility is to use Evas' software_16 engine (which renders into a destination surface of format RGB565) to reduce the bandwidth needed, but how do we match that with the XRender API?
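
To put a number on the bandwidth argument, here is a small sketch of the pixel format difference. The 480x640 screen size is an assumption used only to get concrete figures, and `argb8888_to_rgb565` is a made-up helper, not code from Evas.

```python
def argb8888_to_rgb565(pixel):
    """Pack a 32-bit ARGB8888 pixel into 16-bit RGB565.

    The alpha byte is simply dropped, so this is only meaningful for
    fully opaque pixels (where premultiplied and plain ARGB are equal).
    """
    r = (pixel >> 16) & 0xFF
    g = (pixel >> 8) & 0xFF
    b = pixel & 0xFF
    # Keep the top 5 bits of red, 6 of green and 5 of blue.
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


# Assumption: a 480x640 screen, used only for illustration.
WIDTH, HEIGHT = 480, 640

bytes_argb8888 = WIDTH * HEIGHT * 4  # 4 bytes per pixel
bytes_rgb565 = WIDTH * HEIGHT * 2    # 2 bytes per pixel

print("ARGB8888 frame:", bytes_argb8888, "bytes")
print("RGB565 frame:  ", bytes_rgb565, "bytes")
print("opaque red ->", hex(argb8888_to_rgb565(0xFFFF0000)))
```

Under that assumption, every full-screen update moves half as many bytes in RGB565 as in ARGB8888.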

Another solution could be to abandon the efforts on xf86-video-glamo acceleration and just build a specific Evas engine for glamo: mmap the whole framebuffer memory and manage it through Eina's memory pool manager, handle the surfaces ourselves and do a mix between software_16 and this specific engine. A lot of work, yes, but it looks like the only solution (leaving X aside) that can give us some results. But there's a problem: how do we send the changes into the displayed X window? In our engine we'll use a VRAM backbuffer, and an X client can't know the physical memory address of the area where the window is being drawn. So we'll have a roundtrip here, physical memory (our glamo surface) -> virtual memory (Xshm/X memory) -> physical memory (destination framebuffer), which will surely remove any speedup.

Suggestions?

Sunday, March 1, 2009

Initial Benchmark of the xf86-video-glamo on GTA02

After my tremendous problems setting up my build environment I have finally succeeded :)
So I had the chance to give xf86-video-glamo a try and see how well it behaves. To do the benchmark I used expedite and the results are:


2.76 , Image Blend Unscaled
12.69 , Image Blend Solid Unscaled
1.56 , Image Blend Nearest Scaled
8.77 , Image Blend Nearest Solid Scaled
0.45 , Image Blend Smooth Scaled
5.93 , Image Blend Smooth Solid Scaled
5.02 , Image Blend Nearest Same Scaled
22.05 , Image Blend Nearest Solid Same Scaled
1.27 , Image Blend Smooth Same Scaled
11.84 , Image Blend Smooth Solid Same Scaled
0.51 , Image Blend Border
6.67 , Image Blend Solid Border
0.44 , Image Blend Border Recolor
4.29 , Image Quality Scale
7.22 , Image Data ARGB
4.89 , Image Data ARGB Alpha
6.54 , Image Data YCbCr 601 Pointer List
6.04 , Image Data YCbCr 601 Pointer List Wide Stride
6.67 , Image Crossfade
9.28 , Text Basic
1.05 , Text Styles
0.79 , Text Styles Different Strings
5.64 , Text Change
5.67 , Textblock Basic
4.67 , Textblock Intl
1.81 , Rect Blend
9.57 , Rect Solid
69.84 , Rect Blend Few
84.22 , Rect Solid Few
41.09 , Image Blend Occlude 1 Few
24.00 , Image Blend Occlude 2 Few
17.50 , Image Blend Occlude 3 Few
43.26 , Image Blend Occlude 1
14.59 , Image Blend Occlude 2
4.87 , Image Blend Occlude 3
27.31 , Image Blend Occlude 1 Many
6.81 , Image Blend Occlude 2 Many
2.21 , Image Blend Occlude 3 Many
3.79 , Image Blend Occlude 1 Very Many
0.66 , Image Blend Occlude 2 Very Many
0.36 , Image Blend Occlude 3 Very Many
3.51 , Polygon Blend
11.86 , EVAS SPEED

The benchmark took around half an hour to finish and, judging from the final EVAS SPEED value, it is really, really slow.
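
The `value , name` lines above are easy to post-process. Here is a minimal sketch that parses a few of them (copied from the list) and picks the lowest score; the function name `parse_results` is hypothetical and the sample is deliberately short.

```python
def parse_results(text):
    """Parse 'value , name' lines into a {name: value} dict."""
    results = {}
    for line in text.splitlines():
        if "," not in line:
            continue  # skip blank or unrelated lines
        value, name = line.split(",", 1)
        results[name.strip()] = float(value)
    return results


# A few lines copied from the results above.
sample = """\
2.76 , Image Blend Unscaled
0.45 , Image Blend Smooth Scaled
84.22 , Rect Solid Few
0.36 , Image Blend Occlude 3 Very Many
11.86 , EVAS SPEED
"""

results = parse_results(sample)
lowest = min(results, key=results.get)
print(lowest, results[lowest])
```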

Note that right now the driver is just a wrapper on top of fbdev, so no acceleration is coded yet, only software-based rendering. Given that the CPU isn't that fast either, the benchmark results are no surprise.

XRender acceleration is one of the possibilities for improving the performance, and the good news is that Evas already provides an XRender-based engine. So let's get our hands dirty and start hacking the driver! :)