#28 - support PCI hole resize in qemu-xen

Owner: Anthony PERARD <anthony.perard@citrix.com>

Date: Wed Jan 8 12:30:01 2014

Last Update: Wed Jan 8 13:15:02 2014

Severity: normal

Affects:

State: Open


From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
To: Hanweidong <hanweidong@huawei.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Yangxiaowei <xiaowei.yang@huawei.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Anthony Perard <anthony.perard@citrix.com>, Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>, "xudong.hao@intel.com" <xudong.hao@intel.com>, Yanqiangjun <yanqiangjun@huawei.com>, Wangzhenguo <wangzhenguo@huawei.com>, Luonengjun <luonengjun@huawei.com>, "Gonglei \(Arei\)" <arei.gonglei@huawei.com>, "xiantao.zhang@intel.com" <xiantao.zhang@intel.com>
Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured with 4G memory
Date: Wed, 29 May 2013 17:18:24 +0100
Message-ID: <alpine.DEB.2.02.1305291701580.4799@kaball.uk.xensource.com>

On Thu, 25 Apr 2013, Hanweidong wrote:
> > -----Original Message-----
> > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > bounces@lists.xen.org] On Behalf Of Hanweidong
> > Sent: 26 March 2013 17:38
> > To: Stefano Stabellini
> > Cc: George Dunlap; xudong.hao@intel.com; Yanqiangjun; Luonengjun;
> > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > devel@lists.xen.org; xiantao.zhang@intel.com
> > Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured
> > with 4G memory
> > 
> > 
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > Sent: 18 March 2013 20:02
> > > To: Hanweidong
> > > Cc: George Dunlap; Stefano Stabellini; Yanqiangjun; Luonengjun;
> > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > > devel@lists.xen.org; xudong.hao@intel.com; xiantao.zhang@intel.com
> > > Subject: RE: [Xen-devel] GPU passthrough issue when VM is configured
> > > with 4G memory
> > >
> > > On Wed, 13 Mar 2013, Hanweidong wrote:
> > > > > The MMIO hole was adjusted to e0000000 - fc000000, but QEMU uses the
> > > > > code below to initialise RAM in xen_ram_init:
> > > >
> > > >     ...
> > > >     block_len = ram_size;
> > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > >         /* Xen does not allocate the memory continuously, and keep
> > a
> > > hole at
> > > >          * HVM_BELOW_4G_MMIO_START of HVM_BELOW_4G_MMIO_LENGTH
> > > >          */
> > > >         block_len += HVM_BELOW_4G_MMIO_LENGTH;
> > > >     }
> > > >     memory_region_init_ram(&ram_memory, "xen.ram", block_len);
> > > >     vmstate_register_ram_global(&ram_memory);
> > > >
> > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > >         above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
> > > >         below_4g_mem_size = HVM_BELOW_4G_RAM_END;
> > > >     } else {
> > > >         below_4g_mem_size = ram_size;
> > > >     }
> > > >     ...
> > > >
> > > > > HVM_BELOW_4G_RAM_END is f0000000. If we change HVM_BELOW_4G_RAM_END
> > > > > to e0000000, which is consistent with hvmloader when assigning a GPU,
> > > > > the guest works for us. So we wonder whether xen_ram_init in QEMU
> > > > > should be made consistent with hvmloader.
> > > > >
> > > > > In addition, we found that QEMU hardcodes 0xe0000000 in pc_init1(),
> > > > > as shown below. Should these places use a consistent MMIO hole or not?
> > > >
> > > >     if (ram_size >= 0xe0000000 ) {
> > > >         above_4g_mem_size = ram_size - 0xe0000000;
> > > >         below_4g_mem_size = 0xe0000000;
> > > >     } else {
> > > >         above_4g_mem_size = 0;
> > > >         below_4g_mem_size = ram_size;
> > > >     }
> > >
> > > The guys at Intel sent a couple of patches recently to fix this issue:
> > >
> > > http://marc.info/?l=xen-devel&m=136150317011027
> > > http://marc.info/?l=qemu-devel&m=136177475215360&w=2
> > >
> > > Do they solve your problem?
> > 
> > These two patches didn't solve our problem.
> > 
> 
> I debugged this issue with the above two patches applied. I want to share some information and discuss a solution here. The issue is caused by a VM having a large PCI hole (MMIO size), which results in QEMU setting up its memory regions inconsistently with hvmloader (QEMU hardcodes 0xe0000000 in pc_init1 and xen_ram_init). I created a virtual device with a 1GB MMIO size to debug this. First, QEMU set up the memory regions, except for the PCI hole region, in pc_init1() and xen_ram_init(). Then hvmloader calculated pci_mem_start as 0x80000000 and wrote it to the TOM register, which triggered QEMU to update the PCI hole region to start at 0x80000000 using i440fx_update_pci_mem_hole(). Finally, the Windows 7 VM (configured with 8G) crashed with BSOD code 0x00000024. When I hardcoded the values in QEMU's pc_init1 and xen_ram_init to match hvmloader's, the problem went away.
>
> Although the above two patches pass the actual PCI hole start address to QEMU, it is too late: QEMU's pc_init1() and xen_ram_init() have already set up the other memory regions, so the PCI hole might overlap with RAM regions in this case. So I think hvmloader should set up the PCI devices and calculate the PCI hole first; then QEMU can map the memory regions correctly from the beginning.
> 

Thank you very much for your detailed analysis of the problem.

After reading this, I wonder how it is possible that qemu-xen-traditional
does not have this issue, considering that AFAIK there is no way for
hvmloader to tell qemu-xen-traditional where the PCI hole starts.

The only difference between upstream QEMU and qemu-xen-traditional is
that the former would start the PCI hole at 0xf0000000 while the latter
would start the PCI hole at 0xe0000000.

So I would expect that your test, where hvmloader is updating the PCI
hole region to start at 0x80000000, would fail on qemu-xen-traditional
too.

Of course having the PCI hole starting unconditionally at 0xf0000000
makes it much easier to run into problems than starting it at
0xe0000000.


Assuming that everything above is correct, this is what I would do:

1) modify upstream QEMU to start the PCI hole at 0xe0000000, to match
qemu-xen-unstable in terms of configuration and not to introduce any
regressions. Do this for the Xen 4.3 release.

2) for Xen 4.4 rework the two patches above and improve
i440fx_update_pci_mem_hole: resizing the pci_hole subregion is not
enough, it also needs to be able to resize the system memory region
(xen.ram) to make room for the bigger pci_hole
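
To illustrate point 2, here is a minimal sketch in plain C. It is not QEMU
code: split_ram() and ram_overlapping_new_hole() are hypothetical helpers, and
the sketch assumes the PCI hole always runs from its start address up to
4 GiB. It computes how much RAM already mapped below 4 GiB would collide with
the hole once hvmloader lowers its start, which is the part of xen.ram that
would have to be moved above 4 GiB.

```c
#include <assert.h>
#include <stdint.h>

#define FOUR_GIB (1ULL << 32)

/* Hypothetical layout record for illustration; not a QEMU structure. */
struct ram_layout {
    uint64_t below_4g;  /* bytes of guest RAM mapped below the PCI hole */
    uint64_t above_4g;  /* bytes of guest RAM placed above 4 GiB */
};

/* Split ram_size around a PCI hole spanning [hole_start, 4 GiB). */
static struct ram_layout split_ram(uint64_t ram_size, uint64_t hole_start)
{
    struct ram_layout l;

    assert(hole_start > 0 && hole_start < FOUR_GIB);
    if (ram_size >= hole_start) {
        l.below_4g = hole_start;
        l.above_4g = ram_size - hole_start;
    } else {
        l.below_4g = ram_size;
        l.above_4g = 0;
    }
    return l;
}

/*
 * Bytes of RAM that were laid out below 4 GiB for a hole starting at
 * old_start, but that fall inside the hole once it starts at new_start.
 */
static uint64_t ram_overlapping_new_hole(uint64_t ram_size,
                                         uint64_t old_start,
                                         uint64_t new_start)
{
    struct ram_layout old = split_ram(ram_size, old_start);

    return old.below_4g > new_start ? old.below_4g - new_start : 0;
}
```

For the 8G test guest described above, lowering the hole start from 0xe0000000
to 0x80000000 gives an overlap of 0x60000000 (1.5 GiB), and the layout that
would be needed from the start is split_ram(8ULL << 30, 0x80000000ULL).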

From: Hanweidong <hanweidong@huawei.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Yangxiaowei <xiaowei.yang@huawei.com>, George Dunlap <George.Dunlap@eu.citrix.com>, "xudong.hao@intel.com" <xudong.hao@intel.com>, Yanqiangjun <yanqiangjun@huawei.com>, Anthony Perard <anthony.perard@citrix.com>, Luonengjun <luonengjun@huawei.com>, Wangzhenguo <wangzhenguo@huawei.com>, "Gonglei \(Arei\)" <arei.gonglei@huawei.com>, "xiantao.zhang@intel.com" <xiantao.zhang@intel.com>
Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured with 4G memory
Date: Thu, 30 May 2013 01:29:40 +0000
Message-ID: <FAB5C136CA8BEA4DBEA2F641E3F5363863BB0B2C@szxeml538-mbx.china.huawei.com>

> -----Original Message-----
> From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> Sent: 30 May 2013 0:18
> To: Hanweidong
> Cc: Stefano Stabellini; George Dunlap; xudong.hao@intel.com;
> Yanqiangjun; Luonengjun; Wangzhenguo; Yangxiaowei; Gonglei (Arei);
> Anthony Perard; xen-devel@lists.xen.org; xiantao.zhang@intel.com
> Subject: RE: [Xen-devel] GPU passthrough issue when VM is configured
> with 4G memory
> 
> On Thu, 25 Apr 2013, Hanweidong wrote:
> > > -----Original Message-----
> > > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > > bounces@lists.xen.org] On Behalf Of Hanweidong
> > > Sent: 26 March 2013 17:38
> > > To: Stefano Stabellini
> > > Cc: George Dunlap; xudong.hao@intel.com; Yanqiangjun; Luonengjun;
> > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > > devel@lists.xen.org; xiantao.zhang@intel.com
> > > Subject: Re: [Xen-devel] GPU passthrough issue when VM is
> configured
> > > with 4G memory
> > >
> > >
> > > > -----Original Message-----
> > > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > > Sent: 18 March 2013 20:02
> > > > To: Hanweidong
> > > > Cc: George Dunlap; Stefano Stabellini; Yanqiangjun; Luonengjun;
> > > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > > > devel@lists.xen.org; xudong.hao@intel.com;
> xiantao.zhang@intel.com
> > > > Subject: RE: [Xen-devel] GPU passthrough issue when VM is
> configured
> > > > with 4G memory
> > > >
> > > > On Wed, 13 Mar 2013, Hanweidong wrote:
> > > > > MMIO HOLE was adjusted to e0000000 - fc000000. But QEMU uses
> below
> > > > code to init
> > > > > RAM in xen_ram_init:
> > > > >
> > > > >     ...
> > > > >     block_len = ram_size;
> > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > >         /* Xen does not allocate the memory continuously, and
> keep
> > > a
> > > > hole at
> > > > >          * HVM_BELOW_4G_MMIO_START of HVM_BELOW_4G_MMIO_LENGTH
> > > > >          */
> > > > >         block_len += HVM_BELOW_4G_MMIO_LENGTH;
> > > > >     }
> > > > >     memory_region_init_ram(&ram_memory, "xen.ram", block_len);
> > > > >     vmstate_register_ram_global(&ram_memory);
> > > > >
> > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > >         above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
> > > > >         below_4g_mem_size = HVM_BELOW_4G_RAM_END;
> > > > >     } else {
> > > > >         below_4g_mem_size = ram_size;
> > > > >     }
> > > > >     ...
> > > > >
> > > > > HVM_BELOW_4G_RAM_END is f0000000. If we change
> HVM_BELOW_4G_RAM_END
> > > > to e0000000,
> > > > > Which it's consistent with hvmloader when assigning a GPU, and
> then
> > > > guest worked
> > > > > for us. So we wondering that xen_ram_init in QEMU should be
> > > > consistent with
> > > > > hvmloader.
> > > > >
> > > > > In addition, we found QEMU uses hardcode 0xe0000000 in
> pc_init1()
> > > as
> > > > below.
> > > > > Should keep these places handle the consistent mmio hole or not?
> > > > >
> > > > >     if (ram_size >= 0xe0000000 ) {
> > > > >         above_4g_mem_size = ram_size - 0xe0000000;
> > > > >         below_4g_mem_size = 0xe0000000;
> > > > >     } else {
> > > > >         above_4g_mem_size = 0;
> > > > >         below_4g_mem_size = ram_size;
> > > > >     }
> > > >
> > > > The guys at Intel sent a couple of patches recently to fix this
> issue:
> > > >
> > > > http://marc.info/?l=xen-devel&m=136150317011027
> > > > http://marc.info/?l=qemu-devel&m=136177475215360&w=2
> > > >
> > > > Do they solve your problem?
> > >
> > > These two patches didn't solve our problem.
> > >
> >
> > I debugged this issue with above two patches. I want to share some
> information and discuss solution here. This issue is actually caused by
> that a VM has a large pci hole (mmio size) which results in QEMU sets
> memory regions inconsistently with hvmloader (QEMU uses hardcode
> 0xe0000000 in pc_init1 and xen_ram_init). I created a virtual device
> with 1GB mmio size to debug this issue. Firstly, QEMU set memory
> regions except pci hole region in pc_init1() and xen_ram_init(), then
> hvmloader calculated pci_mem_start as 0x80000000, and wrote it to TOM
> register, which triggered QEMU to update pci hole region with
> 0x80000000 using i440fx_update_pci_mem_hole(). Finally the windows 7 VM
> (configured 8G) crashed with BSOD code 0x00000024. If I hardcode in
> QEMU pc_init1 and xen_ram_init to match hvmloader's. Then the problem
> was gone.
> >
> > Althrough above two patches will pass actual pci hole start address
> to QEMU, but it's too late, QEMU pc_init1() and xen_ram_init() already
> set the other memory regions, and obviously the pci hole might overlap
> with ram regions in this case. So I think hvmloader should setup pci
> devices and calculate pci hole first, then QEMU can map memory regions
> correctly from the beginning.
> >
> 
> Thank you very much for your detailed analysis of the problem.
> 
> After reading this, I wonder how is possible that qemu-xen-traditional
> does not have this issue, considering that AFAIK there is no way for
> hvmloader to tell qemu-xen-traditional where the PCI hole starts.
> 
> The only difference between upstream QEMU and qemu-xen-traditional is
> that the former would start the PCI hole at 0xf0000000 while the latter
> would start the PCI hole at 0xe0000000.
> 
> So I would expect that your test, where hvmloader is updating the PCI
> hole region to start at 0x80000000, would fail on qemu-xen-traditional
> too.

Yes, I think so. 

> 
> Of course having the PCI hole starting unconditionally at 0xf0000000
> makes it much easier to run into problems than starting it at
> 0xe0000000.
> 
> 
> Assuming that everything above is correct, this is what I would do:
> 
> 1) modify upstream QEMU to start the PCI hole at 0xe0000000, to match
> qemu-xen-unstable in terms of configuration and not to introduce any
> regressions. Do this for the Xen 4.3 release.

It's a quick improvement before implementing a thorough solution.

weidong

> 
> 2) for Xen 4.4 rework the two patches above and improve
> i440fx_update_pci_mem_hole: resizing the pci_hole subregion is not
> enough, it also needs to be able to resize the system memory region
> (xen.ram) to make room for the bigger pci_hole


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
To: Hanweidong <hanweidong@huawei.com>
Cc: "xudong.hao@intel.com" <xudong.hao@intel.com>, Yanqiangjun <yanqiangjun@huawei.com>, Anthony Perard <anthony.perard@citrix.com>, Wangzhenguo <wangzhenguo@huawei.com>, Luonengjun <luonengjun@huawei.com>, "Gonglei \(Arei\)" <arei.gonglei@huawei.com>, "xiantao.zhang@intel.com" <xiantao.zhang@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Yangxiaowei <xiaowei.yang@huawei.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, George Dunlap <George.Dunlap@eu.citrix.com>
Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured with 4G memory
Date: Thu, 30 May 2013 11:27:58 +0100
Message-ID: <alpine.DEB.2.02.1305301115210.4799@kaball.uk.xensource.com>

On Thu, 30 May 2013, Hanweidong wrote:
> > -----Original Message-----
> > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > Sent: 30 May 2013 0:18
> > To: Hanweidong
> > Cc: Stefano Stabellini; George Dunlap; xudong.hao@intel.com;
> > Yanqiangjun; Luonengjun; Wangzhenguo; Yangxiaowei; Gonglei (Arei);
> > Anthony Perard; xen-devel@lists.xen.org; xiantao.zhang@intel.com
> > Subject: RE: [Xen-devel] GPU passthrough issue when VM is configured
> > with 4G memory
> > 
> > On Thu, 25 Apr 2013, Hanweidong wrote:
> > > > -----Original Message-----
> > > > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > > > bounces@lists.xen.org] On Behalf Of Hanweidong
> > > > Sent: 26 March 2013 17:38
> > > > To: Stefano Stabellini
> > > > Cc: George Dunlap; xudong.hao@intel.com; Yanqiangjun; Luonengjun;
> > > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > > > devel@lists.xen.org; xiantao.zhang@intel.com
> > > > Subject: Re: [Xen-devel] GPU passthrough issue when VM is
> > configured
> > > > with 4G memory
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > > > Sent: 18 March 2013 20:02
> > > > > To: Hanweidong
> > > > > Cc: George Dunlap; Stefano Stabellini; Yanqiangjun; Luonengjun;
> > > > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > > > > devel@lists.xen.org; xudong.hao@intel.com;
> > xiantao.zhang@intel.com
> > > > > Subject: RE: [Xen-devel] GPU passthrough issue when VM is
> > configured
> > > > > with 4G memory
> > > > >
> > > > > On Wed, 13 Mar 2013, Hanweidong wrote:
> > > > > > MMIO HOLE was adjusted to e0000000 - fc000000. But QEMU uses
> > below
> > > > > code to init
> > > > > > RAM in xen_ram_init:
> > > > > >
> > > > > >     ...
> > > > > >     block_len = ram_size;
> > > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > > >         /* Xen does not allocate the memory continuously, and
> > keep
> > > > a
> > > > > hole at
> > > > > >          * HVM_BELOW_4G_MMIO_START of HVM_BELOW_4G_MMIO_LENGTH
> > > > > >          */
> > > > > >         block_len += HVM_BELOW_4G_MMIO_LENGTH;
> > > > > >     }
> > > > > >     memory_region_init_ram(&ram_memory, "xen.ram", block_len);
> > > > > >     vmstate_register_ram_global(&ram_memory);
> > > > > >
> > > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > > >         above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
> > > > > >         below_4g_mem_size = HVM_BELOW_4G_RAM_END;
> > > > > >     } else {
> > > > > >         below_4g_mem_size = ram_size;
> > > > > >     }
> > > > > >     ...
> > > > > >
> > > > > > HVM_BELOW_4G_RAM_END is f0000000. If we change
> > HVM_BELOW_4G_RAM_END
> > > > > to e0000000,
> > > > > > Which it's consistent with hvmloader when assigning a GPU, and
> > then
> > > > > guest worked
> > > > > > for us. So we wondering that xen_ram_init in QEMU should be
> > > > > consistent with
> > > > > > hvmloader.
> > > > > >
> > > > > > In addition, we found QEMU uses hardcode 0xe0000000 in
> > pc_init1()
> > > > as
> > > > > below.
> > > > > > Should keep these places handle the consistent mmio hole or not?
> > > > > >
> > > > > >     if (ram_size >= 0xe0000000 ) {
> > > > > >         above_4g_mem_size = ram_size - 0xe0000000;
> > > > > >         below_4g_mem_size = 0xe0000000;
> > > > > >     } else {
> > > > > >         above_4g_mem_size = 0;
> > > > > >         below_4g_mem_size = ram_size;
> > > > > >     }
> > > > >
> > > > > The guys at Intel sent a couple of patches recently to fix this
> > issue:
> > > > >
> > > > > http://marc.info/?l=xen-devel&m=136150317011027
> > > > > http://marc.info/?l=qemu-devel&m=136177475215360&w=2
> > > > >
> > > > > Do they solve your problem?
> > > >
> > > > These two patches didn't solve our problem.
> > > >
> > >
> > > I debugged this issue with above two patches. I want to share some
> > information and discuss solution here. This issue is actually caused by
> > that a VM has a large pci hole (mmio size) which results in QEMU sets
> > memory regions inconsistently with hvmloader (QEMU uses hardcode
> > 0xe0000000 in pc_init1 and xen_ram_init). I created a virtual device
> > with 1GB mmio size to debug this issue. Firstly, QEMU set memory
> > regions except pci hole region in pc_init1() and xen_ram_init(), then
> > hvmloader calculated pci_mem_start as 0x80000000, and wrote it to TOM
> > register, which triggered QEMU to update pci hole region with
> > 0x80000000 using i440fx_update_pci_mem_hole(). Finally the windows 7 VM
> > (configured 8G) crashed with BSOD code 0x00000024. If I hardcode in
> > QEMU pc_init1 and xen_ram_init to match hvmloader's. Then the problem
> > was gone.
> > >
> > > Althrough above two patches will pass actual pci hole start address
> > to QEMU, but it's too late, QEMU pc_init1() and xen_ram_init() already
> > set the other memory regions, and obviously the pci hole might overlap
> > with ram regions in this case. So I think hvmloader should setup pci
> > devices and calculate pci hole first, then QEMU can map memory regions
> > correctly from the beginning.
> > >
> > 
> > Thank you very much for your detailed analysis of the problem.
> > 
> > After reading this, I wonder how is possible that qemu-xen-traditional
> > does not have this issue, considering that AFAIK there is no way for
> > hvmloader to tell qemu-xen-traditional where the PCI hole starts.
> > 
> > The only difference between upstream QEMU and qemu-xen-traditional is
> > that the former would start the PCI hole at 0xf0000000 while the latter
> > would start the PCI hole at 0xe0000000.
> > 
> > So I would expect that your test, where hvmloader is updating the PCI
> > hole region to start at 0x80000000, would fail on qemu-xen-traditional
> > too.
> 
> Yes, I think so. 
> 
> > 
> > Of course having the PCI hole starting unconditionally at 0xf0000000
> > makes it much easier to run into problems than starting it at
> > 0xe0000000.
> > 
> > 
> > Assuming that everything above is correct, this is what I would do:
> > 
> > 1) modify upstream QEMU to start the PCI hole at 0xe0000000, to match
> > qemu-xen-unstable in terms of configuration and not to introduce any
> > regressions. Do this for the Xen 4.3 release.
> 
> It's a quick improvement before implementing a thorough solution.

Cool.
Can you confirm that the following patch solves your original problem?


diff --git a/xen-all.c b/xen-all.c
index daf43b9..259f862 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -35,6 +35,9 @@
     do { } while (0)
 #endif
 
+#define QEMU_XEN_BELOW_4G_RAM_END       0xe0000000
+#define QEMU_XEN_BELOW_4G_MMIO_LENGTH   ((1ULL << 32) - QEMU_XEN_BELOW_4G_RAM_END)
+
 static MemoryRegion ram_memory, ram_640k, ram_lo, ram_hi;
 static MemoryRegion *framebuffer;
 static bool xen_in_migration;
@@ -160,18 +163,18 @@ static void xen_ram_init(ram_addr_t ram_size)
     ram_addr_t block_len;
 
     block_len = ram_size;
-    if (ram_size >= HVM_BELOW_4G_RAM_END) {
+    if (ram_size >= QEMU_XEN_BELOW_4G_RAM_END) {
         /* Xen does not allocate the memory continuously, and keep a hole at
-         * HVM_BELOW_4G_MMIO_START of HVM_BELOW_4G_MMIO_LENGTH
+         * QEMU_XEN_BELOW_4G_RAM_END of QEMU_XEN_BELOW_4G_MMIO_LENGTH
          */
-        block_len += HVM_BELOW_4G_MMIO_LENGTH;
+        block_len += QEMU_XEN_BELOW_4G_MMIO_LENGTH;
     }
     memory_region_init_ram(&ram_memory, "xen.ram", block_len);
     vmstate_register_ram_global(&ram_memory);
 
-    if (ram_size >= HVM_BELOW_4G_RAM_END) {
-        above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
-        below_4g_mem_size = HVM_BELOW_4G_RAM_END;
+    if (ram_size >= QEMU_XEN_BELOW_4G_RAM_END) {
+        above_4g_mem_size = ram_size - QEMU_XEN_BELOW_4G_RAM_END;
+        below_4g_mem_size = QEMU_XEN_BELOW_4G_RAM_END;
     } else {
         below_4g_mem_size = ram_size;
     }
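
As a sanity check of the arithmetic in the patch above, here is a small
standalone sketch in plain C, outside QEMU. xen_ram_block_len() is a
hypothetical helper that mirrors only the block_len computation in
xen_ram_init(); nothing else from QEMU is modelled. With RAM below 4G ending
at 0xe0000000, the MMIO length works out to 0x20000000 (512 MiB), compared
with 0x10000000 (256 MiB) when it ends at 0xf0000000.

```c
#include <assert.h>
#include <stdint.h>

#define QEMU_XEN_BELOW_4G_RAM_END     0xe0000000ULL
#define QEMU_XEN_BELOW_4G_MMIO_LENGTH ((1ULL << 32) - QEMU_XEN_BELOW_4G_RAM_END)

/*
 * Hypothetical helper: size of the "xen.ram" block for a guest with
 * ram_size bytes of RAM, following the logic of the patched xen_ram_init().
 */
static uint64_t xen_ram_block_len(uint64_t ram_size)
{
    uint64_t block_len = ram_size;

    if (ram_size >= QEMU_XEN_BELOW_4G_RAM_END) {
        /* Guest RAM is not contiguous: account for the hole below 4 GiB. */
        block_len += QEMU_XEN_BELOW_4G_MMIO_LENGTH;
    }
    assert(block_len >= ram_size); /* guard against 64-bit wrap-around */
    return block_len;
}
```

For an 8 GiB guest this yields a block length of 0x220000000, that is 8 GiB of
RAM plus the 512 MiB hole. A 2 GiB guest stays below the threshold and keeps
block_len equal to ram_size.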

From: Hanweidong <hanweidong@huawei.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Anthony Perard <anthony.perard@citrix.com>, "xudong.hao@intel.com" <xudong.hao@intel.com>, Yanqiangjun <yanqiangjun@huawei.com>, "Gonglei \(Arei\)" <arei.gonglei@huawei.com>, "xiantao.zhang@intel.com" <xiantao.zhang@intel.com>, Luonengjun <luonengjun@huawei.com>, Wangzhenguo <wangzhenguo@huawei.com>, Yangxiaowei <xiaowei.yang@huawei.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, George Dunlap <George.Dunlap@eu.citrix.com>
Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured with 4G memory
Date: Thu, 30 May 2013 10:45:54 +0000
Message-ID: <FAB5C136CA8BEA4DBEA2F641E3F5363863BB158D@szxeml538-mbx.china.huawei.com>

> -----Original Message-----
> From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> Sent: 30 May 2013 18:28
> To: Hanweidong
> Cc: Stefano Stabellini; George Dunlap; xudong.hao@intel.com;
> Yanqiangjun; Luonengjun; Wangzhenguo; Yangxiaowei; Gonglei (Arei);
> Anthony Perard; xen-devel@lists.xen.org; xiantao.zhang@intel.com
> Subject: RE: [Xen-devel] GPU passthrough issue when VM is configured
> with 4G memoryo
> 
> On Thu, 30 May 2013, Hanweidong wrote:
> > > -----Original Message-----
> > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > Sent: 30 May 2013 0:18
> > > To: Hanweidong
> > > Cc: Stefano Stabellini; George Dunlap; xudong.hao@intel.com;
> > > Yanqiangjun; Luonengjun; Wangzhenguo; Yangxiaowei; Gonglei (Arei);
> > > Anthony Perard; xen-devel@lists.xen.org; xiantao.zhang@intel.com
> > > Subject: RE: [Xen-devel] GPU passthrough issue when VM is
> configured
> > > with 4G memory
> > >
> > > On Thu, 25 Apr 2013, Hanweidong wrote:
> > > > > -----Original Message-----
> > > > > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > > > > bounces@lists.xen.org] On Behalf Of Hanweidong
> > > > > Sent: 26 March 2013 17:38
> > > > > To: Stefano Stabellini
> > > > > Cc: George Dunlap; xudong.hao@intel.com; Yanqiangjun;
> Luonengjun;
> > > > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > > > > devel@lists.xen.org; xiantao.zhang@intel.com
> > > > > Subject: Re: [Xen-devel] GPU passthrough issue when VM is
> > > configured
> > > > > with 4G memory
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Stefano Stabellini
> [mailto:stefano.stabellini@eu.citrix.com]
> > > > > > Sent: 18 March 2013 20:02
> > > > > > To: Hanweidong
> > > > > > Cc: George Dunlap; Stefano Stabellini; Yanqiangjun;
> Luonengjun;
> > > > > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard;
> xen-
> > > > > > devel@lists.xen.org; xudong.hao@intel.com;
> > > xiantao.zhang@intel.com
> > > > > > Subject: RE: [Xen-devel] GPU passthrough issue when VM is
> > > configured
> > > > > > with 4G memory
> > > > > >
> > > > > > On Wed, 13 Mar 2013, Hanweidong wrote:
> > > > > > > MMIO HOLE was adjusted to e0000000 - fc000000. But QEMU
> uses
> > > below
> > > > > > code to init
> > > > > > > RAM in xen_ram_init:
> > > > > > >
> > > > > > >     ...
> > > > > > >     block_len = ram_size;
> > > > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > > > >         /* Xen does not allocate the memory continuously,
> and
> > > keep
> > > > > a
> > > > > > hole at
> > > > > > >          * HVM_BELOW_4G_MMIO_START of
> HVM_BELOW_4G_MMIO_LENGTH
> > > > > > >          */
> > > > > > >         block_len += HVM_BELOW_4G_MMIO_LENGTH;
> > > > > > >     }
> > > > > > >     memory_region_init_ram(&ram_memory, "xen.ram",
> block_len);
> > > > > > >     vmstate_register_ram_global(&ram_memory);
> > > > > > >
> > > > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > > > >         above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
> > > > > > >         below_4g_mem_size = HVM_BELOW_4G_RAM_END;
> > > > > > >     } else {
> > > > > > >         below_4g_mem_size = ram_size;
> > > > > > >     }
> > > > > > >     ...
> > > > > > >
> > > > > > > HVM_BELOW_4G_RAM_END is f0000000. If we change
> > > HVM_BELOW_4G_RAM_END
> > > > > > to e0000000,
> > > > > > > Which it's consistent with hvmloader when assigning a GPU,
> and
> > > then
> > > > > > guest worked
> > > > > > > for us. So we wondering that xen_ram_init in QEMU should be
> > > > > > consistent with
> > > > > > > hvmloader.
> > > > > > >
> > > > > > > In addition, we found QEMU uses hardcode 0xe0000000 in
> > > pc_init1()
> > > > > as
> > > > > > below.
> > > > > > > Should keep these places handle the consistent mmio hole or
> not?
> > > > > > >
> > > > > > >     if (ram_size >= 0xe0000000 ) {
> > > > > > >         above_4g_mem_size = ram_size - 0xe0000000;
> > > > > > >         below_4g_mem_size = 0xe0000000;
> > > > > > >     } else {
> > > > > > >         above_4g_mem_size = 0;
> > > > > > >         below_4g_mem_size = ram_size;
> > > > > > >     }
> > > > > >
> > > > > > The guys at Intel sent a couple of patches recently to fix
> this
> > > issue:
> > > > > >
> > > > > > http://marc.info/?l=xen-devel&m=136150317011027
> > > > > > http://marc.info/?l=qemu-devel&m=136177475215360&w=2
> > > > > >
> > > > > > Do they solve your problem?
> > > > >
> > > > > These two patches didn't solve our problem.
> > > > >
> > > >
> > > > I debugged this issue with above two patches. I want to share
> some
> > > information and discuss solution here. This issue is actually
> caused by
> > > that a VM has a large pci hole (mmio size) which results in QEMU
> sets
> > > memory regions inconsistently with hvmloader (QEMU uses hardcode
> > > 0xe0000000 in pc_init1 and xen_ram_init). I created a virtual
> device
> > > with 1GB mmio size to debug this issue. Firstly, QEMU set memory
> > > regions except pci hole region in pc_init1() and xen_ram_init(),
> then
> > > hvmloader calculated pci_mem_start as 0x80000000, and wrote it to
> TOM
> > > register, which triggered QEMU to update pci hole region with
> > > 0x80000000 using i440fx_update_pci_mem_hole(). Finally the windows
> 7 VM
> > > (configured 8G) crashed with BSOD code 0x00000024. If I hardcode in
> > > QEMU pc_init1 and xen_ram_init to match hvmloader's. Then the
> problem
> > > was gone.
> > > >
> > > > Althrough above two patches will pass actual pci hole start
> address
> > > to QEMU, but it's too late, QEMU pc_init1() and xen_ram_init()
> already
> > > set the other memory regions, and obviously the pci hole might
> overlap
> > > with ram regions in this case. So I think hvmloader should setup
> pci
> > > devices and calculate pci hole first, then QEMU can map memory
> regions
> > > correctly from the beginning.
> > > >
> > >
> > > Thank you very much for your detailed analysis of the problem.
> > >
> > > After reading this, I wonder how is possible that qemu-xen-
> traditional
> > > does not have this issue, considering that AFAIK there is no way
> for
> > > hvmloader to tell qemu-xen-traditional where the PCI hole starts.
> > >
> > > The only difference between upstream QEMU and qemu-xen-traditional
> is
> > > that the former would start the PCI hole at 0xf0000000 while the
> latter
> > > would start the PCI hole at 0xe0000000.
> > >
> > > So I would expect that your test, where hvmloader is updating the
> PCI
> > > hole region to start at 0x80000000, would fail on qemu-xen-
> traditional
> > > too.
> >
> > Yes, I think so.
> >
> > >
> > > Of course having the PCI hole starting unconditionally at
> 0xf0000000
> > > makes it much easier to run into problems than starting it at
> > > 0xe0000000.
> > >
> > >
> > > Assuming that everything above is correct, this is what I would do:
> > >
> > > 1) modify upstream QEMU to start the PCI hole at 0xe0000000, to
> match
> > > qemu-xen-unstable in terms of configuration and not to introduce
> any
> > > regressions. Do this for the Xen 4.3 release.
> >
> > It's a quick improvement before implementing a thorough solution.
> 
> Cool.
> Can you confirm that the following patch solves your original problem?
> 

Actually, I had already modified the code in the same way as your patch below. It worked for me when passing through one GPU whose MMIO size is about 200MB.

There is also a hardcoded 0xe0000000 in pc_init1() in pc_piix.c. I suggest replacing it with QEMU_XEN_BELOW_4G_RAM_END, so that the memory layout calculation is consistent between xen_ram_init() and pc_init1().
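
To make the consistency concern concrete, here is a minimal sketch in plain C. It is not the actual QEMU code: below_4g_size() and layout_mismatch() are hypothetical helpers that only model the below-4G split done at each call site. It shows that two call sites using different boundaries (0xe0000000 and 0xf0000000) disagree as soon as the guest has enough RAM to reach the lower boundary.

```c
#include <assert.h>
#include <stdint.h>

#define FOUR_GIB (1ULL << 32)

/* Hypothetical helper: RAM mapped below 4 GiB when low RAM ends at ram_end. */
static uint64_t below_4g_size(uint64_t ram_size, uint64_t ram_end)
{
    assert(ram_end > 0 && ram_end < FOUR_GIB);
    return ram_size >= ram_end ? ram_end : ram_size;
}

/*
 * Hypothetical helper: number of bytes by which two call sites disagree
 * about the below-4G RAM size when they use different boundaries.
 */
static uint64_t layout_mismatch(uint64_t ram_size, uint64_t end_a,
                                uint64_t end_b)
{
    uint64_t a = below_4g_size(ram_size, end_a);
    uint64_t b = below_4g_size(ram_size, end_b);

    return a > b ? a - b : b - a;
}
```

With a 4G guest, layout_mismatch(4ULL << 30, 0xe0000000ULL, 0xf0000000ULL) is 0x10000000 (256 MiB), while a 3G guest sees no mismatch because all of its RAM fits below both boundaries. Using a single shared constant at both call sites makes the mismatch zero by construction.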

weidong
	
> 
> diff --git a/xen-all.c b/xen-all.c
> index daf43b9..259f862 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -35,6 +35,9 @@
>      do { } while (0)
>  #endif
> 
> +#define QEMU_XEN_BELOW_4G_RAM_END       0xe0000000
> +#define QEMU_XEN_BELOW_4G_MMIO_LENGTH   ((1ULL << 32) - QEMU_XEN_BELOW_4G_RAM_END)
> +
>  static MemoryRegion ram_memory, ram_640k, ram_lo, ram_hi;
>  static MemoryRegion *framebuffer;
>  static bool xen_in_migration;
> @@ -160,18 +163,18 @@ static void xen_ram_init(ram_addr_t ram_size)
>      ram_addr_t block_len;
> 
>      block_len = ram_size;
> -    if (ram_size >= HVM_BELOW_4G_RAM_END) {
> +    if (ram_size >= QEMU_XEN_BELOW_4G_RAM_END) {
>          /* Xen does not allocate the memory continuously, and keep a hole at
> -         * HVM_BELOW_4G_MMIO_START of HVM_BELOW_4G_MMIO_LENGTH
> +         * QEMU_XEN_BELOW_4G_RAM_END of QEMU_XEN_BELOW_4G_MMIO_LENGTH
>           */
> -        block_len += HVM_BELOW_4G_MMIO_LENGTH;
> +        block_len += QEMU_XEN_BELOW_4G_MMIO_LENGTH;
>      }
>      memory_region_init_ram(&ram_memory, "xen.ram", block_len);
>      vmstate_register_ram_global(&ram_memory);
> 
> -    if (ram_size >= HVM_BELOW_4G_RAM_END) {
> -        above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
> -        below_4g_mem_size = HVM_BELOW_4G_RAM_END;
> +    if (ram_size >= QEMU_XEN_BELOW_4G_RAM_END) {
> +        above_4g_mem_size = ram_size - QEMU_XEN_BELOW_4G_RAM_END;
> +        below_4g_mem_size = QEMU_XEN_BELOW_4G_RAM_END;
>      } else {
>          below_4g_mem_size = ram_size;
>      }
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Yangxiaowei <xiaowei.yang@huawei.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, George Dunlap <George.Dunlap@eu.citrix.com>, "xudong.hao@intel.com" <xudong.hao@intel.com>, Yanqiangjun <yanqiangjun@huawei.com>, Hanweidong <hanweidong@huawei.com>, Anthony Perard <anthony.perard@citrix.com>, "xiantao.zhang@intel.com" <xiantao.zhang@intel.com>, "Gonglei \(Arei\)" <arei.gonglei@huawei.com>, Wangzhenguo <wangzhenguo@huawei.com>, Luonengjun <luonengjun@huawei.com>
Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured with 4G memory
Date: Mon, 3 Jun 2013 09:11:15 -0400
Message-ID: <20130603131115.GJ6893@phenom.dumpdata.com>

On Wed, May 29, 2013 at 05:18:24PM +0100, Stefano Stabellini wrote:
> On Thu, 25 Apr 2013, Hanweidong wrote:
> > > -----Original Message-----
> > > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> > > bounces@lists.xen.org] On Behalf Of Hanweidong
> > > Sent: 2013年3月26日 17:38
> > > To: Stefano Stabellini
> > > Cc: George Dunlap; xudong.hao@intel.com; Yanqiangjun; Luonengjun;
> > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > > devel@lists.xen.org; xiantao.zhang@intel.com
> > > Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured
> > > with 4G memory
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: Stefano Stabellini [mailto:stefano.stabellini@eu.citrix.com]
> > > > Sent: 2013年3月18日 20:02
> > > > To: Hanweidong
> > > > Cc: George Dunlap; Stefano Stabellini; Yanqiangjun; Luonengjun;
> > > > Wangzhenguo; Yangxiaowei; Gonglei (Arei); Anthony Perard; xen-
> > > > devel@lists.xen.org; xudong.hao@intel.com; xiantao.zhang@intel.com
> > > > Subject: RE: [Xen-devel] GPU passthrough issue when VM is configured
> > > > with 4G memory
> > > >
> > > > On Wed, 13 Mar 2013, Hanweidong wrote:
> > > > > MMIO HOLE was adjusted to e0000000 - fc000000. But QEMU uses below
> > > > > code to init RAM in xen_ram_init:
> > > > >
> > > > >     ...
> > > > >     block_len = ram_size;
> > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > >         /* Xen does not allocate the memory continuously, and keep a hole at
> > > > >          * HVM_BELOW_4G_MMIO_START of HVM_BELOW_4G_MMIO_LENGTH
> > > > >          */
> > > > >         block_len += HVM_BELOW_4G_MMIO_LENGTH;
> > > > >     }
> > > > >     memory_region_init_ram(&ram_memory, "xen.ram", block_len);
> > > > >     vmstate_register_ram_global(&ram_memory);
> > > > >
> > > > >     if (ram_size >= HVM_BELOW_4G_RAM_END) {
> > > > >         above_4g_mem_size = ram_size - HVM_BELOW_4G_RAM_END;
> > > > >         below_4g_mem_size = HVM_BELOW_4G_RAM_END;
> > > > >     } else {
> > > > >         below_4g_mem_size = ram_size;
> > > > >     }
> > > > >     ...
> > > > >
> > > > > HVM_BELOW_4G_RAM_END is f0000000. If we change HVM_BELOW_4G_RAM_END
> > > > > to e0000000, which is consistent with hvmloader when assigning a GPU,
> > > > > the guest works for us. So we are wondering whether xen_ram_init in
> > > > > QEMU should be consistent with hvmloader.
> > > > >
> > > > > In addition, we found that QEMU uses a hardcoded 0xe0000000 in
> > > > > pc_init1(), as below. Should these places handle the MMIO hole
> > > > > consistently or not?
> > > > >
> > > > >     if (ram_size >= 0xe0000000 ) {
> > > > >         above_4g_mem_size = ram_size - 0xe0000000;
> > > > >         below_4g_mem_size = 0xe0000000;
> > > > >     } else {
> > > > >         above_4g_mem_size = 0;
> > > > >         below_4g_mem_size = ram_size;
> > > > >     }
> > > >
> > > > The guys at Intel sent a couple of patches recently to fix this issue:
> > > >
> > > > http://marc.info/?l=xen-devel&m=136150317011027
> > > > http://marc.info/?l=qemu-devel&m=136177475215360&w=2
> > > >
> > > > Do they solve your problem?
> > > 
> > > These two patches didn't solve our problem.
> > > 
> > 
> > > I debugged this issue with the above two patches applied, and want to
> > > share some information and discuss a solution here. The issue is caused
> > > by a VM having a large PCI hole (MMIO size), which results in QEMU
> > > setting up memory regions inconsistently with hvmloader (QEMU uses a
> > > hardcoded 0xe0000000 in pc_init1 and xen_ram_init). I created a virtual
> > > device with a 1GB MMIO size to debug this. First, QEMU set up the memory
> > > regions, except the PCI hole region, in pc_init1() and xen_ram_init().
> > > Then hvmloader calculated pci_mem_start as 0x80000000 and wrote it to
> > > the TOM register, which triggered QEMU to update the PCI hole region to
> > > start at 0x80000000 using i440fx_update_pci_mem_hole(). Finally, the
> > > Windows 7 VM (configured with 8G) crashed with BSOD code 0x00000024.
> > > If I hardcode the values in QEMU's pc_init1 and xen_ram_init to match
> > > hvmloader's, the problem goes away.
> > > 
> > > Although the above two patches pass the actual PCI hole start address
> > > to QEMU, it is too late: QEMU's pc_init1() and xen_ram_init() have
> > > already set up the other memory regions, so the PCI hole might overlap
> > > with RAM regions in this case. So I think hvmloader should set up the
> > > PCI devices and calculate the PCI hole first; then QEMU can map memory
> > > regions correctly from the beginning.
> > 
> 
> Thank you very much for your detailed analysis of the problem.
> 
> After reading this, I wonder how it is possible that qemu-xen-traditional
> does not have this issue, considering that AFAIK there is no way for
> hvmloader to tell qemu-xen-traditional where the PCI hole starts.
> 
> The only difference between upstream QEMU and qemu-xen-traditional is
> that the former would start the PCI hole at 0xf0000000 while the latter
> would start the PCI hole at 0xe0000000.
> 
> So I would expect that your test, where hvmloader is updating the PCI
> hole region to start at 0x80000000, would fail on qemu-xen-traditional
> too.
> 
> Of course having the PCI hole starting unconditionally at 0xf0000000
> makes it much easier to run into problems than starting it at
> 0xe0000000.
> 
> 
> Assuming that everything above is correct, this is what I would do:
> 
> 1) modify upstream QEMU to start the PCI hole at 0xe0000000, to match
> qemu-xen-unstable in terms of configuration and not to introduce any
> regressions. Do this for the Xen 4.3 release.
> 
> 2) for Xen 4.4 rework the two patches above and improve
> i440fx_update_pci_mem_hole: resizing the pci_hole subregion is not
> enough, it also needs to be able to resize the system memory region
> (xen.ram) to make room for the bigger pci_hole


Would that make migration more difficult - meaning if you have now two
different QEMU versions where the PCI hole is different on them? Or is
that not an issue and QEMU handles setting the layout nicely? Or is
the 0xe0000000 the norm in Xen 4.1, and Xen 4.2?

I am assuming you unplug the PCI device before you migrate of course.


From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Yangxiaowei <xiaowei.yang@huawei.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, "xiantao.zhang@intel.com" <xiantao.zhang@intel.com>, "Gonglei \(Arei\)" <arei.gonglei@huawei.com>, Wangzhenguo <wangzhenguo@huawei.com>, Luonengjun <luonengjun@huawei.com>, "xudong.hao@intel.com" <xudong.hao@intel.com>, Yanqiangjun <yanqiangjun@huawei.com>, Hanweidong <hanweidong@huawei.com>, Anthony Perard <anthony.perard@citrix.com>
Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured with 4G memory
Date: Mon, 3 Jun 2013 16:14:03 +0100
Message-ID: <alpine.DEB.2.02.1306031611570.4589@kaball.uk.xensource.com>

On Mon, 3 Jun 2013, Konrad Rzeszutek Wilk wrote:
> Would that make migration more difficult - meaning if you have now two
> different QEMU versions where the PCI hole is different on them? Or is
> that not an issue and QEMU handles setting the layout nicely? Or is
> the 0xe0000000 the norm in Xen 4.1, and Xen 4.2?
>
> I am assuming you unplug the PCI device before you migrate of course.


The change in configuration is only for qemu-xen and upstream QEMU, and
Xen 4.3 is the first release that defaults to it, so I don't think we
need to maintain save/restore compatibility yet. But from the next
release on it is going to be unavoidable.

From: Pasi Kärkkäinen <pasik@iki.fi>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Yangxiaowei <xiaowei.yang@huawei.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Anthony Perard <anthony.perard@citrix.com>, Hanweidong <hanweidong@huawei.com>, Yanqiangjun <yanqiangjun@huawei.com>, "xudong.hao@intel.com" <xudong.hao@intel.com>, Wangzhenguo <wangzhenguo@huawei.com>, Luonengjun <luonengjun@huawei.com>, "xiantao.zhang@intel.com" <xiantao.zhang@intel.com>, "Gonglei \(Arei\)" <arei.gonglei@huawei.com>
Subject: Re: [Xen-devel] GPU passthrough issue when VM is configured with 4G memory / Xen 4.4
Date: Thu, 26 Sep 2013 23:09:49 +0300
Message-ID: <20130926200948.GP2924@reaktio.net>

Hello,

George: I think this one is missing from the Xen 4.4 status emails? (see below)

On Wed, May 29, 2013 at 05:18:24PM +0100, Stefano Stabellini wrote:
> 
> Thank you very much for your detailed analysis of the problem.
> 
> After reading this, I wonder how it is possible that qemu-xen-traditional
> does not have this issue, considering that AFAIK there is no way for
> hvmloader to tell qemu-xen-traditional where the PCI hole starts.
> 
> The only difference between upstream QEMU and qemu-xen-traditional is
> that the former would start the PCI hole at 0xf0000000 while the latter
> would start the PCI hole at 0xe0000000.
> 
> So I would expect that your test, where hvmloader is updating the PCI
> hole region to start at 0x80000000, would fail on qemu-xen-traditional
> too.
> 
> Of course having the PCI hole starting unconditionally at 0xf0000000
> makes it much easier to run into problems than starting it at
> 0xe0000000.
> 
> 
> Assuming that everything above is correct, this is what I would do:
> 
> 1) modify upstream QEMU to start the PCI hole at 0xe0000000, to match
> qemu-xen-unstable in terms of configuration and not to introduce any
> regressions. Do this for the Xen 4.3 release.
> 
> 2) for Xen 4.4 rework the two patches above and improve
> i440fx_update_pci_mem_hole: resizing the pci_hole subregion is not
> enough, it also needs to be able to resize the system memory region
> (xen.ram) to make room for the bigger pci_hole


I think this second part hasn't been done/fixed yet? 
Feel free to correct me if it has been done already :) 
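
For reference, the second step quoted above amounts to roughly the following bookkeeping. This is only a sketch under stated assumptions: `struct guest_layout` and `relayout_for_hole()` are hypothetical names, not QEMU code, and a real fix would also have to resize the xen.ram memory region itself rather than just adjust two numbers.

```c
#include <stdint.h>

/* Hypothetical description of where guest RAM sits around the PCI hole. */
struct guest_layout {
    uint64_t below_4g;   /* bytes of RAM mapped from address 0 upwards */
    uint64_t above_4g;   /* bytes of RAM mapped from 4G upwards */
    uint64_t hole_start; /* first address of the PCI hole */
};

/*
 * Hypothetical helper: when the PCI hole is moved down to new_hole_start,
 * any RAM that would now overlap the hole is accounted above 4G instead.
 * Returns the number of bytes that had to be moved (0 if nothing overlaps).
 */
static uint64_t relayout_for_hole(struct guest_layout *l,
                                  uint64_t new_hole_start)
{
    uint64_t moved = 0;

    if (new_hole_start < l->below_4g) {
        moved = l->below_4g - new_hole_start;
        l->below_4g = new_hole_start;
        l->above_4g += moved;
    }
    l->hole_start = new_hole_start;
    return moved;
}
```

For the case from the thread (8G guest laid out with RAM up to 0xe0000000, hole then moved to 0x80000000), 0x60000000 bytes would have to move above 4G, and the total amount of RAM stays the same.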

Thanks,

-- Pasi



From: Ian Campbell <Ian.Campbell@citrix.com>
To: xen-devel <xen-devel@lists.xen.org>
Cc: Stefano Stabellini <stefano.stabellini@citrix.com>, Anthony Perard <anthony.perard@citrix.com>
Subject: [Xen-devel] support PCI hole resize in qemu-xen
Date: Wed, 8 Jan 2014 12:22:56 +0000
Message-ID: <1389183776.4883.42.camel@kazak.uk.xensource.com>

create <alpine.DEB.2.02.1305291701580.4799@kaball.uk.xensource.com>
title it support PCI hole resize in qemu-xen
thanks

We took a workaround late in 4.3 for this and planned to fix it properly
for 4.4, but we seem to have forgotten. I think it is probably now also
too late for 4.4. I've created a bug in the hope that we can fix this
for 4.5.

I struggled to find a good reference for this (old) issue; the comments
at
http://bugs.xenproject.org/xen/mid/%3Calpine.DEB.2.02.1305291701580.4799@kaball.uk.xensource.com%3E
seem like a good point in that massive thread.

The whole thing is at:
http://www.gossamer-threads.com/lists/engine?do=post_view_flat;post=273750;page=1;mh=-1;list=xen;sb=post_latest_reply;so=ASC

I suspect there were other relevant threads around the time too.

Ian.




From: Ian Campbell <Ian.Campbell@citrix.com>
To: Anthony PERARD <anthony.perard@citrix.com>
Cc: ian.jackson@eu.citrix.com, George Dunlap <George.Dunlap@eu.citrix.com>, Stefano Stabellini <stefano.stabellini@citrix.com>, Jan Beulich <JBeulich@suse.com>, xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [Xen-devel] [xen-unstable test] 24250: tolerable FAIL
Date: Wed, 8 Jan 2014 13:04:43 +0000
Message-ID: <1389186283.4883.61.camel@kazak.uk.xensource.com>

owner 28 Anthony PERARD <anthony.perard@citrix.com>
thanks
On Wed, 2014-01-08 at 12:55 +0000, Anthony PERARD wrote:
> I'll try to reproduce the issue.

Thanks!

Ian.

